Section I

INTRODUCTORY REMARKS 

Paragraph

Scope of this text   1
Mental equipment necessary for cryptanalytic work   2
Validity of results of cryptanalysis   3



1. Scope of this text. — a. It is assumed that the student has studied the two preceding 
texts written by the same author and forming part of this series, viz, Elementary Military
Cryptography, and Advanced Military Cryptography. These texts deal exclusively with cryptography
as defined therein; that is, with the various types of ciphers and codes, their principles of
construction, and their employment in cryptographing and decryptographing messages. Particular
emphasis was placed upon such means and methods as are practicable for military usage. It is 
also assumed that the student has firmly in mind the technically precise, special nomenclature 
employed in those texts, for the terms and definitions therein will all be used in the present text, 
with essentially the same significances. If this is not the case, it is recommended that the student 
review his preceding work, in order to regain a familiarity with the specific meanings assigned 
to the terms used therein. There will be no opportunity herein to repeat this information and 
unless he understands clearly the significance of the terms employed, his progress will be retarded.

b. This text constitutes the first of a series of texts on cryptanalysis. Although most of the
information contained herein is applicable to cryptograms of various types and sources, special
emphasis will be laid upon the principles and methods of solving military cryptograms. Except
for an introductory discussion of fundamental principles underlying the science of cryptanalytics,
this first text in the series will deal solely with the principles and methods for the analysis of
monoalphabetic substitution ciphers. Even with this limitation it will be possible to discuss
only a few of the many variations of this one type; but with a firm grasp upon the general
principles no difficulties should be experienced with any variations that may be encountered.

c. This and some of the succeeding texts will deal only with elementary types of cipher
systems not because they may be encountered in military operations but because their study is
essential to an understanding of the principles underlying the solution of the modern, very much
more complex types of ciphers and codes that are likely to be employed by the larger
governments today in the conduct of their military affairs in time of war.

d. All of this series of texts will deal only with the solution of visible secret writing. At 
some future date, texts dealing with the solution of invisible secret writing, and with secret 
signalling systems, may be prepared. 

2. Mental equipment necessary for cryptanalytic work. — a. Captain Parker Hitt, in the 
first United States Army manual * dealing with cryptography, opens the first chapter of his 
valuable treatise with the following sentence: 

Success in dealing with unknown ciphers is measured by these four things in the order named: perseverance,
careful methods of analysis, intuition, luck. 

* Hitt, Capt. Parker, Manual for the Solution of Military Ciphers, Army Service Schools Press, Fort
Leavenworth, Kansas, 1916. 2d Edition, 1918. (Both out of print.)

These words are as true today as they were then. There is no royal road to success in the
solution of cryptograms. Hitt goes on to say: 

Cipher work will have little permanent attraction for one who expects results at once, without labor, for 
there is a vast amount of purely routine labor in the preparation of frequency tables, the rearrangement of 
ciphers for examination, and the trial and fitting of letter to letter before the message begins to appear. 

The present author deems it advisable to add that the kind of work involved in solving 
cryptograms is not at all similar to that involved in solving “cross-word puzzles”, for example. 
The wide vogue the latter have had and continue to have is due to the appeal they make to the 
quite common instinct for mysteries of one sort or another; but in solving a cross-word puzzle 
there is usually no necessity for performing any preliminary labor, and palpable results become 
evident after the first minute or two of attention. This successful start spurs the cross-word 
“addict” on to complete the solution, which rarely requires more than an hour’s time. Further- 
more, cross-word puzzles are all alike in basic principle and once understood, there is no more to 
learn. Skill comes largely from the embellishment of one’s vocabulary, though, to be sure, con- 
stant practice and exercise of the imagination contribute to the ease and rapidity with which 
solutions are generally reached. In solving cryptograms, however, many principles must be 
learned, for there are many different systems of varying degrees of complexity. Even some of 
the simpler varieties require the preparation of tabulations of one sort or another, which many
people find irksome; moreover, it is only toward the very close of the solution that results in the
form of intelligible text become evident. Often, indeed, the student will not even know whether
he is on the right track until he has performed a large amount of preliminary “spade work”
involving many hours of labor. Thus, without at least a willingness to pursue a fair amount of
theoretical study, and a more than average amount of patience and perseverance, little skill and
experience can be gained in the rather difficult art of cryptanalysis. General Givierge, the author
of an excellent treatise on cryptanalysis, remarks in this connection: *

The cryptanalyst’s attitude must be that of William the Silent: No need to hope in order to undertake, nor 
to succeed in order to persevere.

b. As regards Hitt’s reference to careful methods of analysis, before one can be said to be a
cryptanalyst worthy of the name it is necessary that one should have firstly a sound knowledge
of the basic principles of cryptanalytics, and secondly, a long, varied, and active practical
experience in the successful application of those principles. It is not sufficient to have read treatises
on this subject. One month’s actual practice in solution is worth a whole year’s mere reading
of theoretical principles. An exceedingly important element of success in solving the more
intricate ciphers is the possession of the rather unusual mental faculty designated in general
terms as the power of inductive and deductive reasoning. Probably this is an inherited rather 
than an acquired faculty; the best sort of training for its emergence, if latent in the individual, 
and for its development is the study of the natural sciences, such as chemistry, physics, biology, 
geology, and the like. Other sciences such as linguistics and philology are also excellent. Apti- 
tude in mathematics is quite important, more especially in the solution of ciphers than of codes. 

c. An active imagination, or perhaps what Hitt and other writers call intuition, is essential,
but mere imagination uncontrolled by a judicious spirit will more often be a hindrance than a
help. In practical cryptanalysis the imaginative or intuitive faculties must, in other words, be
guided by good judgment, by practical experience, and by as thorough a knowledge of the general
situation or extraneous circumstances that led to the sending of the cryptogram as is possible
to obtain. In this respect the many cryptograms exchanged between correspondents whose
identities and general affairs, commercial, social, or political, are known are far more readily



* Givierge, Général Marcel, Cours de Cryptographie, Paris, 1925, p. 301.
solved than are isolated cryptograms exchanged between unknown correspondents, dealing with
unknown subjects. It is obvious that in the former case there are good data upon which the
intuitive powers of the cryptanalyst can be brought to bear, whereas in the latter case no such
data are available. Consequently, in the absence of such data, no matter how good the imagination
and intuition of the cryptanalyst, these powers are of no particular service to him. Some
writers, however, regard the intuitive spirit as valuable from still another viewpoint, as may be
noted in the following: *

Intuition, like a flash of lightning, lasts only for a second. It generally comes when one is tormented by
a difficult decipherment and when one reviews in his mind the fruitless experiments already tried. Suddenly
the light breaks through and one finds after a few minutes what previous days of labor were unable to reveal.

This, too, is true, but unfortunately there is no way in which the intuition may be summoned
at will, when it is most needed.* There are certain authors who regard as indispensable
the possession of a somewhat rare, rather mysterious faculty that they designate by the word
“flair”, or by the expression “cipher brains.” Even so excellent an authority as General
Givierge,* in referring to this mental facility, uses the following words: “Over and above
perseverance and this aptitude of mind which some authors consider a special gift, and which they
call intuition, or even, in its highest manifestation, clairvoyance, cryptographic studies will
continue more and more to demand the qualities of orderliness and memory.” Although the
present author believes a special aptitude for the work is essential to cryptanalytic success, he is
sure there is nothing mysterious about the matter at all. Special aptitude is prerequisite to
success in all fields of endeavor. There are, for example, thousands of physicists, hundreds of
excellent ones, but only a handful of world-wide fame. Should it be said, then, that a physicist

* Lange et Soudart, Traité de Cryptographie, Librairie Félix Alcan, Paris, 1925, p. 104.

* The following extracts are of interest in this connection: 

The fact that the scientific investigator works 60 per cent of his time by non-rational means is, it seems, quite
insufficiently recognized. There is without the least doubt an instinct for research, and often the most successful
investigators of nature are quite unable to give an account of their reasons for doing such and such an
experiment, or for placing side by side two apparently unrelated facts. Again, one of the most salient traits in the
character of the successful scientific worker is the capacity for knowing that a point is proved when it would not
appear to be proved to an outside intelligence functioning in a purely rational manner; thus the investigator
feels that some proposition is true, and proceeds at once to the next set of experiments without waiting and wasting
time in the elaboration of the formal proof of the point which heavier minds would need. Questionless such a
scientific intuition may and does sometimes lead investigators astray, but it is quite certain that if they did
not widely make use of it, they would not get a quarter as far as they do. Experiments confirm each other, and a
false step is usually soon discovered. And not only by this partial replacement of reason by intuition does the
work of science go on, but also to the born scientific worker — and emphatically they cannot be made — the
structure of the method of research is as it were given, he cannot explain it to you, though he may be brought to agree
a posteriori to a formal logical presentation of the way the method works. — Excerpt from Needham, Joseph,
The Sceptical Biologist, London, 1929, p. 79.

The essence of scientific method, quite simply, is to try to see how data arrange themselves into causal
configurations. Scientific problems are solved by collecting data and by “thinking about them all the time.”
We need to look at strange things until, by the appearance of known configurations, they seem familiar, and to
look at familiar things until we see novel configurations which make them appear strange. We must look at
events until they become luminous. That is scientific method . . . Insight is the touchstone . . . The
application of insight as the touchstone of method enables us to evaluate properly the role of imagination in scientific
method. The scientific process is akin to the artistic process: it is a process of selecting out those elements of
experience which fit together and recombining them in the mind. Much of this kind of research is simply a
ceaseless mulling over, and even the physical scientist has considerable need of an armchair . . . Our view of
scientific method as a struggle to obtain insight forces the admission that science is half art . . . Insight is the
unknown quantity which has eluded students of scientific method. — Excerpts from an article entitled Insight and
Scientific Method, by Willard Waller, in The American Journal of Sociology, Vol. XL, 1934.

* Op. cit., p. 302.
who has achieved very notable success in his field has done so because he is the fortunate possessor
of a mysterious faculty? That he is fortunate in possessing a special aptitude for his subject is
granted, but that there is anything mysterious about it, partaking of the nature of clairvoyance
(if, indeed, the latter is a reality) is not granted. While the ultimate nature of any mental
process seems to be as complete a mystery today as it has ever been, the present author would
like to see the superficial veil of mystery removed from a subject that has been shrouded in
mystery from even before the Middle Ages down to our own times. (The principal and easily
understandable reason for this is that governments have always closely guarded cryptographic
secrets and anything so guarded soon becomes “mysterious.”) He would, rather, have the
student approach the subject as he might approach any other science that can stand on its own
merits with other sciences, because cryptanalytics, like other sciences, has a practical importance
in human affairs. It presents to the inquiring mind an interest in its own right as a branch of
knowledge; it, too, holds forth many difficulties and disappointments, and these are all the more
keenly felt when the nature of these difficulties is not understood by those unfamiliar with the
special circumstances that very often are the real factors that led to success in other cases.
Finally, just as in the other sciences wherein many men labor long and earnestly for the true 
satisfaction and pleasure that comes from work well-done, so the mental pleasure that the 
successful cryptanalyst derives from his accomplishments is very often the only reward for much 
of the drudgery that he must do in his daily work. General Givierge’s words in this connection 
are well worth quoting:* 

Some studies will last for years before bearing fruit. In the case of others, cryptanalysts undertaking them 
never get any result. But, for a cryptanalyst who likes the work, the joy of discoveries effaces the memory of his 
hours of doubt and impatience. 

d. With his usual deft touch, Hitt says of the element of luck, as regards the role it plays in 
analysis:

As to luck, there is the old miners’ proverb: “Gold is where you find it.”

The cryptanalyst is lucky when one of the correspondents whose ciphers he is studying 
makes a blunder that gives the necessary clue; or when he finds two cryptograms identical in
text but in different keys in the same system; or when he finds two cryptograms identical in
text but in different systems, and so on. The element of luck is there, to be sure, but the
cryptanalyst must be on the alert if he is to profit by these lucky “breaks.”

e. If the present author were asked to state, in view of the progress in the field since 1916,
what elements might be added to the four ingredients Hitt thought essential to cryptanalytic
success, he would be inclined to mention the following: 

(1) A broad, general education, embodying interests covering as many fields of practical 
knowledge as possible. This is useful because the cryptanalyst is often called upon to solve 
messages dealing with the most varied of human activities, and the more he knows about these 
activities, the easier his task.

(2) Access to a large library of current literature, and wide and direct contacts with sources
of collateral information. These often afford clues as to the contents of specific messages. For
example, to be able instantly to have at his disposal a newspaper report or a personal report of
events described or referred to in a message under investigation goes a long way toward
simplifying or facilitating solution. Government cryptanalysts are sometimes fortunately situated in
this respect, especially where various agencies work in harmony.

(3) Proper coordination of effort. This includes the organization of cryptanalytic personnel
into harmonious, efficient teams of cooperating individuals.



* Op. cit., p. 301.
(4) Under mental equipment he would also include the faculty of being able to concentrate
on a problem for rather long periods of time, without distraction, nervous irritability, and
impatience. The strain under which cryptanalytic studies are necessarily conducted is quite
severe and too long-continued application has the effect of draining nervous energy to an
unwholesome degree, so that a word or two of caution may not here be out of place. One should
continue at work only so long as a peaceful, calm spirit prevails, whether the work is fruitful or
not. But just as soon as the mind becomes wearied with the exertion, or just as soon as a feeling
of hopelessness or mental fatigue intervenes, it is better to stop completely and turn to other
activities, rest, or play. It is essential to remark that systematization and orderliness of work
are aids in reducing nervous tension and irritability. On this account it is better to take the
time to prepare the data carefully, rewrite the text if necessary, and so on, rather than work 
with slipshod, incomplete, or improperly arranged material. 

(5) A retentive memory is an important asset to cryptanalytic skill, especially in the
solution of codes. The ability to remember individual groups, their approximate locations in other
messages, the associations they form with other groups, their peculiarities and similarities, saves
much wear and tear of the mental machinery, as well as much time in looking up these groups in
indexes.

f. It may be advisable to add a word or two at this point to prepare the student to expect
slight mental jars and tensions which will almost inevitably come to him in the conscientious
study of this and the subsequent texts. The present author is well aware of the complaint of
students that authors of texts on cryptanalysis base much of their explanation upon their
foreknowledge of the “answer” — which the student does not know while he is attempting to follow
the solution with an unbiased mind. They complain too that these authors use such expressions
as “obviously”, “naturally”, “of course”, “it is evident that”, and so on, when the circumstances
seem not at all to warrant their use. There is no question but that this sort of treatment is apt
to discourage the student, especially when the point elucidated becomes clear to him only after
many hours’ labor, whereas, according to the book, the author noted the weak spot at the first
moment’s inspection. The present author can only promise to try to avoid making the steps
appear to be much more simple than they really are, and to suppress glaring instances of
unjustifiable “jumping at conclusions.” At the same time he must indicate that for pedagogical reasons
in many cases a message has been consciously “manipulated” so as to allow certain principles to
become more obvious in the illustrative examples than they ever are in practical work. During
the course of some of the explanations attention will even be directed to cases of unjustified
inferences. Furthermore, of the student who is quick in observation and deduction, the author
will only ask that he bear in mind that if the elucidation of certain principles seems prolix and
occupies more space than necessary, this is occasioned by the author’s desire to carry the
explanation forward in very short, easily-comprehended, and plainly-described steps, for the
benefit of students who are perhaps a bit slower to grasp but who, once they understand, are
able to retain and apply principles slowly learned just as well, if not better than the students
who learn more quickly.

3. Validity of results of cryptanalysis. — Valid, or authentic cryptanalytic solutions cannot
and do not represent “opinions” of the cryptanalyst. They are valid only so far as they are
wholly objective, and are susceptible of demonstration and proof, employing authentic, objective
methods. It should hardly be necessary (but an attitude frequently encountered among laymen
makes it advisable) to indicate that the validity of the results achieved by any serious
cryptanalytic studies on authentic material rests upon the same sure foundations and are reached by
the same general steps as the results achieved by any other scientific studies; viz, observation, 
hypothesis, deduction and induction, and confirmatory experiment. Implied in the latter is the 
possibility that two or more qualified investigators, each working independently upon the same
material, will achieve identical (or practically identical) results. Occasionally a
pseudo-cryptanalyst offers “solutions” which cannot withstand such tests; a second, unbiased, investigator
working independently either cannot consistently apply the methods alleged to have been applied
by the pseudo-cryptanalyst, or else, if he can apply them at all, the results (plain-text
translations) are far different in the two cases. The reason for this is that in such cases it is generally
found that the “methods” are not clear-cut, straightforward or mathematical in character.
Instead, they often involve the making of judgments on matters too tenuous to measure, weigh,
or otherwise subject to careful scrutiny. In such cases, the conclusion to which the unprejudiced
observer is forced to come is that the alleged “solution” obtained by the first investigator, the
pseudo-cryptanalyst, is purely subjective. In nearly all cases where this has happened (and they
occur from time to time) there has been uncovered nothing which can in any way be used to
impugn the integrity of the pseudo-cryptanalyst. The worst that can be said of him is that he
has become a victim of a special or peculiar form of self-delusion, and that his desire to solve the
problem, usually in accord with some previously-formed opinion, or notion, has over-balanced,
or undermined, his judgment and good sense.*

* Specific reference can be made to the following typical “case histories”:

Donnelly, Ignatius, The Great Cryptogram. Chicago, 1888.
Owen, Orville W., Sir Francis Bacon's Cipher Story. Detroit, 1895.
Gallup, Elizabeth Wells, Francis Bacon's Biliteral Cipher. Detroit, 1900.
Margoliouth, D. S., The Homer of Aristotle. Oxford, 1923.
Newbold, William Romaine, The Cipher of Roger Bacon. Philadelphia, 1928. (For a scholarly and
complete demolition of Professor Newbold’s work, see an article entitled Roger Bacon and
the Voynich MS, by John M. Manly, in Speculum, Vol. VI, No. 3, July 1931.)
Arensberg, Walter Conrad, The Cryptography of Shakespeare. Los Angeles, 1922.
    The Shakespearean Mystery. Pittsburgh, 1928.
    The Baconian Keys. Pittsburgh, 1928.
Feely, Joseph Martin, The Shakespearean Cypher. Rochester, N. Y., 1931.
    Deciphering Shakespeare. Rochester, N. Y., 1934.
4. The four basic operations in cryptanalysis. — a. The solution of practically every cryptogram
involves four fundamental operations or steps:

(1) The determination of the language employed in the plain-text version. 

(2) The determination of the general system of cryptography employed. 

(3) The reconstruction of the specific key in the case of a cipher system, or the reconstruc- 
tion, partial or complete, of the code book, in the case of a code system; or both, in the case of an 
enciphered code system. 

(4) The reconstruction or establishment of the plain text. 

b. These operations will be taken up in the order in which they are given above and in which
they usually are performed in the solution of cryptograms, although occasionally the second 
step may precede the first. 

5. The determination of the language employed. — a. There is not much that need be said
with respect to this operation except that the determination of the language employed seldom
comes into question in the case of studies made of the cryptograms of an organized enemy.
By this is meant that during wartime the enemy is of course known, and it follows, therefore, 
that the language he employs in his messages will almost certainly be his native or mother tongue. 
Only occasionally nowadays is this rule broken. Formerly it often happened, or it might have 
indeed been the general rule, that the language used in diplomatic correspondence was not the 
mother tongue, but French. In isolated instances during the World War, the Germans used 
English when their own language could for one reason or another not be employed. For example, 
for a year or two before the entry of the United States into that war, during the time America 
was neutral and the German Government maintained its embassy in Washington, some of the
messages exchanged between the Foreign Office in Berlin and the Embassy in Washington were
cryptographed in English, and a copy of the code used was deposited with the Department of
State and our censor. Another instance is found in the case of certain Hindu conspirators who 
were associated with and partially financed by the German Government in 1915 and 1916; they 
employed English as the language of their cryptographic messages. Occasionally the crypto- 
grams of enemy agents may be in a language different from that of the enemy. But in general 
these are, as has been said, isolated instances; as a rule, the language used in cryptograms ex- 
changed between members of large organizations is the mother tongue of the correspondents. 
Where this is not the case, that is, when cryptograms of unknown origin must be studied, the 
cryptanalyst looks for any indications on the cryptograms themselves which may lead to a 
conclusion as to the language employed. Address, signature, and plain-language words in the 
preamble or in the body of the text all come under careful scrutiny, as well as all extraneous 
circumstances connected with the manner in which the cryptograms were obtained, the person 
on whom they were found, or the locale of their origin and destination. 

b. In special cases, or under special circumstances a clue to the language employed is found
in the nature and composition of the cryptographic text itself. For example, if the letters K and
W are entirely absent or appear very rarely in messages, it may indicate that the language is
Spanish, for these letters are absent in the alphabet of that language and are used only to spell
foreign words or names. The presence of accented letters or letters marked with special signs of
one sort or another, peculiar to certain languages, will sometimes indicate the language used.
The Japanese Morse telegraph alphabet and the Russian Morse telegraph alphabet contain
combinations of dots and dashes which are peculiar to those alphabets and thus the interception
of messages containing these special Morse combinations at once indicates the language involved.
Finally, there are certain peculiarities of alphabetic languages which, in certain types of
cryptograms, viz, pure transposition, give clues as to the language used. For example, the frequent
digraph C H, in German, leads to the presence, in cryptograms of the type mentioned, of many
isolated C's and H's; if this is noted, the cryptogram may be assumed to be in German.
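
The K-and-W clue described above lends itself to a short illustration in Python. What follows is only a sketch under stated assumptions: the function name `rare_letter_share`, the two sample messages, and the 1 percent threshold are all invented for the example and are not taken from this text. The check is meaningful only where the letters of the cryptogram keep their plain-text identities (as in pure transposition), and a low share merely suggests Spanish; it proves nothing by itself.

```python
from collections import Counter


def rare_letter_share(text: str, letters: str = "KW") -> float:
    """Return the fraction of alphabetic characters in `text` that are in `letters`."""
    # Keep only alphabetic characters, folded to upper case.
    cleaned = [ch for ch in text.upper() if ch.isalpha()]
    if not cleaned:
        return 0.0
    counts = Counter(cleaned)
    return sum(counts[ch] for ch in letters.upper()) / len(cleaned)


# Hypothetical sample messages, invented for illustration only.
spanish_like = "ENVIEN REFUERZOS AL PUENTE ANTES DEL AMANECER"
english_like = "WE WILL ATTACK THE KNOLL WEST OF THE BRIDGE AT DAWN"

# Assumed threshold: treat K and W as "very rare" below 1 percent.
THRESHOLD = 0.01

for sample in (spanish_like, english_like):
    share = rare_letter_share(sample)
    verdict = "K/W very rare" if share < THRESHOLD else "K/W present"
    print(f"{share:.3f}  {verdict}")
```

On messages as short as these samples the share is a weak indicator; the clue gains weight only as the volume of text grows.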

c. In some cases it is perfectly possible to perform certain steps in cryptanalysis before
the language of the cryptogram has been definitely determined. Frequency studies, for example,
may be made and analytic processes performed without this knowledge, and by a cryptanalyst
wholly unfamiliar with the language even if it has been identified, or who knows only enough
about the language to enable him to recognize valid combinations of letters, syllables, or a few
common words in that language. He may, after this, call to his assistance a translator who may
not be a cryptanalyst but who can materially aid in making necessary assumptions based upon
his special knowledge of the characteristics of the language in question. Thus, cooperation
between cryptanalyst and translator results in solution.*
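
A frequency study of the kind just mentioned requires no knowledge of the underlying language, as the following Python sketch suggests. It is an illustration only: the function name `frequency_table` and the sample cryptogram are hypothetical, and the tally shown is a plain single-letter count rather than any particular tabulation prescribed in this text.

```python
from collections import Counter


def frequency_table(cryptogram: str) -> list[tuple[str, int]]:
    """Count each letter of the cryptogram, most frequent first."""
    # Ignore spaces, digits, and punctuation; fold letters to upper case.
    letters = [ch for ch in cryptogram.upper() if ch.isalpha()]
    counts = Counter(letters)
    # Sort by descending count, then alphabetically so ties are stable.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical cryptogram in letter groups, invented for illustration.
message = "XLMWM WEXIW XQIWW EKI"
for letter, count in frequency_table(message):
    # Print the letter, its count, and a simple tally bar.
    print(f"{letter} {count:2d} {'|' * count}")
```

The table says nothing yet about what the letters stand for; it only prepares the data on which the cryptanalyst, and later the translator, can base assumptions.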

6. The determination of the general system. — a. Except in the case of the more simple 
types of cryptograms, the determination of the general system according to which a given crypto- 
gram has been produced is usually a difficult, if not the most difficult, step in its solution. The 
reason for this is not hard to find. 

b. As will become apparent to the student as he proceeds with his study, in the final analysis,
the solution of every cryptogram involving a form of substitution depends upon its reduction to
monoalphabetic terms, if it is not originally in those terms. This is true not only of ordinary substitution
ciphers, but also of combined substitution-transposition ciphers, and of enciphered code. If the
cryptogram must be reduced to monoalphabetic terms, the manner of its accomplishment is
usually indicated by the cryptogram itself, by external or internal phenomena which become
apparent to the cryptanalyst as he studies the cryptogram. If this is impossible, or too difficult,
the cryptanalyst must, by one means or another, discover how to accomplish this reduction,
by bringing to bear all the special or collateral information he can get from all the sources at his
command. If both these possibilities fail him, there is little left but the long, tedious, and often
fruitless process of elimination. In the case of transposition ciphers of the more complex type,
the discovery of the basic method is often simply a matter of long and tedious elimination of
possibilities. For cryptanalysis has unfortunately not yet attained, and may indeed never
attain, the precision found today in qualitative analysis in chemistry, for example, where the
analytic process is absolutely clear cut and exact in its dichotomy. A few words in explanation of
what is meant may not be amiss. When a chemist seeks to determine the identity of an unknown 

* The writer has seen in print statements that “during the World War . . . decoded messages in Japanese
and Russian without knowing a word of either language.” The extent to which such statements are exaggerated
will soon become obvious to the student. Of course, there are occasional instances in which a mere clerk with
quite limited experience may be able to “solve” a message in an extremely simple system in a language of which
he has no knowledge at all; but such a “solution” calls for nothing more arduous than the ability to recognize
pronounceable combinations of vowels and consonants — an ability that hardly deserves to be rated as
“cryptanalytic” in any real sense. To say that it is possible to solve a cryptogram in a foreign language “without
knowing a word of that language” is not quite the same as to say that it is possible to do so with only a slight
knowledge of the language; and it may be stated without cavil that the better the cryptanalyst’s knowledge of
the language, the greater are the chances for his success and, in any case, the easier is his work.
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substance, he applies certain spedfic reagents to the substance and in a spedfic sequence. The 
first reagent tdls him definitely into which of two primaiy dasses the unknown substance falls. 
He then applies a second test with another specific reagent, which tells him again quite definitely 
into which of two secondary classes the unknown substance falls, and so on, until fiqally he has 
reduced the unknown substance to its simplest terms and has found out what it is. In striking 
contrast to this situation, cryptanalysis affords exceedingly few “reagents” or tests that may be 
applied to determine positiTely that a given dpher belongs to one or the other of two qratems 
yielding externally similar results. And this is what makes the analysis of an isolated, cmnplex 
cryptogram so difficult. Note the limiting adjective “isolated” in the foregoing sentence, for it 
is used advisedly. It is not often that the general system fails to disdose itself or cannot be 
discovered by painstaking investigation when there is a great vdume d text accumulating from 
a regular traffic between numerous correspondents in a large organization. Sooner or later the 
lystem becomes known, either because of blunders and cardessness on the part of the peraoimel 
entrusted with the cryptqgraphing of the messages, or because the accumulation of text itself 
makes possible the determination of the general system by cryptanalytio studies. But in the 
case of a.single or even a few isolated cryptograms concerning which little or no iitformation can 
be gamed by the cryptanalyst, he is often unable, without a knowledge of, or a shrewd guess as to 
the general system employed,, to decompose the heterogeneous text of the c^tog^ain into 
homogeneous, monoalphabetic text, which is the ultimate and essential step in analysis. The 
only knowledge that the cryptanalyst can bring to his aid in this most difficult step is that gained 
by loi^ eq>etience and practice in the analysis of many different types of systems. 

c. On account of the complexities surrounding this particular phase of cryptanalysis, and
because in any scheme of analysis based upon successive eliminations of alternatives the
cryptanalyst can only progress so far as the extent of his own knowledge of all the possible alternatives
will permit, it is necessary that detailed discussion of the eliminative process be postponed until
the student has covered most of the field. For example, the student will perhaps want to know
at once how he can distinguish between a cryptogram that is in code or enciphered code from one
that is in cipher. It is at this stage of his studies impracticable to give him any helpful
indications on his question. In return it may be asked of him why he should expect to be able to do
this in the early stages of his studies when often the experienced expert cryptanalyst is baffled on
the same score?

d. Nevertheless, in lieu of more precise tests not yet discovered, a general guide that may be
useful in cryptanalysis will be built up, step by step as the student progresses, in the form of a
series of charts comprising what may be designated An Analytical Key For Cryptanalysis. (See
Par. 50.) It may be of assistance to the student if, as he proceeds, he will carefully study the
charts and note the place which the particular cipher he is solving occupies in the general
cryptanalytic panorama. These charts admittedly constitute only very brief outlines, and can
therefore be of but little direct assistance to him in the analysis of the more complex types of
ciphers he may encounter later on. So far as they go, however, they may be found to be quite
useful in the study of elementary cryptanalysis. For the experienced cryptanalyst they can
serve only as a means of assuring that no possible step or process is inadvertently overlooked in
attempts to solve a difficult cipher.

e. Much of the labor involved in cryptanalytic work, as referred to in Par. 2, is connected
with this determination of the general system. The preparation of the text, its rewriting in
different forms, sometimes being rewritten in a half dozen ways, the recording of letters, the
establishment of frequencies of occurrences of letters, comparisons and experiments made with
known material of similar character, and so on, constitute much labor that is most often
indispensable, but which sometimes turns out to have been wholly unnecessary, or in vain. In a
recent treatise * it is stated quite boldly that “this work once done, the determination of the
system is often relatively easy.” This statement can certainly apply only to the simpler types of
ciphers; it is entirely misleading as regards the much more frequently encountered complex
cryptograms of modern times.

7. The reconstruction of the specific key. — a. Nearly all practical cryptographic methods
require the use of a specific key to guide, control, or modify the various steps under the general
system. Once the latter has been disclosed, discovered, or has otherwise come into the possession
of the cryptanalyst, the next step in solution is to determine, if necessary, and if possible, the
specific key that was employed to cryptograph the message or messages under examination.
This determination may not be in complete detail; it may go only so far as to lead to a knowledge
of the number of alphabets involved in a substitution cipher, or the number of columns involved
in a transposition cipher, or that a one-part code has been used, in the case of a code system.
But it is often desirable to determine the specific key in as complete a form and with as much
detail as possible, for this information will very frequently be useful in the solution of subsequent
cryptograms exchanged between the same correspondents, since the nature of the specific key
in a solved case may be expected to give clues to the specific key in an unsolved case.

b. Frequently, however, the reconstruction of the key is not a prerequisite to, and does not
constitute an absolutely necessary preliminary step in, the fourth basic operation, viz, the
reconstruction or establishment of the plain text. In many cases, indeed, the two processes are
carried along simultaneously, the one assisting the other, until in the final stages both have been
completed in their entireties. In still other cases the reconstruction of the specific key may
succeed instead of precede the reconstruction of the plain text, and is accomplished purely as a
matter of academic interest; or the specific key may, in unusual cases, never be reconstructed.

8. The reconstruction of the plain text. — a. Little need be said at this point on this phase
of cryptanalysis. The process usually consists, in the case of substitution ciphers, in the
establishment of equivalency between specific letters of the cipher text and the plain text, letter by
letter, pair by pair, and so on, depending upon the particular type of substitution system
involved. In the case of transposition ciphers, the process consists in rearranging the elements of
the cipher text, letter by letter, pair by pair, or occasionally word by word, depending upon the
particular type of transposition system involved, until the letters or words have been returned
to their original plain-text order. In the case of code, the process consists in determining the
meaning of each code group and inserting this meaning in the code text to reestablish the original
plain text.

b. The foregoing processes do not, as a rule, begin at the beginning of a message and
continue letter by letter, or group by group in sequence up to the very end of the message. The
establishment of values of cipher letters in substitution methods, or of the positions to which
cipher letters should be transferred to form the plain text in the case of transposition methods,
comes at very irregular intervals in the process. At first only one or two values scattered here
and there throughout the text may appear; these then form the “skeletons” of words, upon which
further work, a continuation of the reconstruction process, is made possible; in the end the
complete or nearly complete † text is established.

c. In the case of cryptograms in a foreign language, the translation of the solved messages
is a final and necessary step, but is not to be considered as a cryptanalytic process. However,
it is commonly the case that the translation process will be carried on simultaneously with the
cryptanalytic, and will aid the latter, especially when there are lacunae which may be filled in
from the context. (See also Par. 5e in this connection.)

* Lange et Soudart, op. cit., p. 106.

† Sometimes in the case of code, the meaning of a few code groups may be lacking, because there is insufficient
text to establish their meaning.

Section III

FREQUENCY DISTRIBUTIONS

Paragraph
The simple or uniliteral frequency distribution 9
Important features of the normal uniliteral frequency distribution 10
Constancy of the standard or normal uniliteral frequency distribution 11

9. The simple or uniliteral frequency distribution. — a. It has long been known to
cryptographers and typographers that the letters composing the words of any intelligible written text
composed in any language which is alphabetic in construction are employed with greatly varying
frequencies. For example, if on cross-section paper a simple tabulation, shown in Fig. 1, called a
uniliteral frequency distribution, is made of the letters composing the words of the preceding
sentence, the variation in frequency is strikingly demonstrated. It is seen that whereas certain
letters, such as A, E, I, N, O, R, S, and T, are employed very frequently, other letters, such as
C, G, P, and W are employed not nearly so frequently, while still other letters, such as F, J, Q, V,
and Z are employed either seldom or not at all.

  A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X  Y  Z
 14  3  8  4 22  2  9 10 15  0  1  9  3 17 14  8  1 13 10 20  3  1  5  1  7  0

(Total=200 letters)

Figure 1.

b. If a similar tabulation is now made of the letters comprising the words of the second
sentence in the preceding paragraph, the graph shown in Fig. 2 is obtained. Both sentences
have exactly the same number of letters (200).

  A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X  Y  Z
 12  2  8  7 25  7  4  5 20  0  1  9  5 17 14  6  2 13 14 17  5  1  2  1  3  0

(Total=200 letters)

Figure 2.

c. Although each of these two graphs exhibits great variation in the relative frequencies
with which different letters are employed in the respective sentences to which they apply, no
marked differences are exhibited between the frequencies of the same letter in the two graphs.
Compare, for example, the frequencies of A, B, C . . . Z in Fig. 1 with those of A, B, C . . . Z
in Fig. 2. Aside from one or two exceptions, as in the case of the letter F, these two graphs agree
rather strikingly.
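
The tabulation described in subparagraphs a and b can be sketched in a few lines of Python. This is only an illustrative sketch: the two sentences are transcribed from subparagraph a, only the letters A to Z are counted (spaces, punctuation, the hyphen, and the digit in "Fig. 1" are ignored), and the names uniliteral_distribution, SENTENCE_1, and SENTENCE_2 are invented for the example.

```python
from collections import Counter
import string

# The first two sentences of subparagraph a, as transcribed from the text.
SENTENCE_1 = (
    "It has long been known to cryptographers and typographers that the "
    "letters composing the words of any intelligible written text composed "
    "in any language which is alphabetic in construction are employed with "
    "greatly varying frequencies."
)
SENTENCE_2 = (
    "For example, if on cross-section paper a simple tabulation, shown in "
    "Fig. 1, called a uniliteral frequency distribution, is made of the "
    "letters composing the words of the preceding sentence, the variation "
    "in frequency is strikingly demonstrated."
)


def uniliteral_distribution(text):
    """Return a dict giving the count of each letter A-Z found in text."""
    counts = Counter(ch for ch in text.upper() if ch in string.ascii_uppercase)
    # Include letters that never occur, so every distribution has 26 entries.
    return {letter: counts.get(letter, 0) for letter in string.ascii_uppercase}


dist_1 = uniliteral_distribution(SENTENCE_1)
dist_2 = uniliteral_distribution(SENTENCE_2)

# Print the two distributions side by side for comparison.
for letter in string.ascii_uppercase:
    print(f"{letter} {dist_1[letter]:3d} {dist_2[letter]:3d}")
```

Printing the two columns side by side makes the comparison asked for in subparagraph c a matter of reading across each row.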



d. This agreement, or similarity, would be practically complete if the two texts were much
longer, for example, five times as long. In fact, when two texts of similar character, each
containing more than 1,000 letters, are compared, it would be found that the respective frequencies
of the 26 letters composing the two graphs show only very slight differences. This means, in
other words, that in normal text each letter of the alphabet occurs with a rather constant or
characteristic frequency which it tends to approximate, depending upon the length of the text
analyzed. The longer the text (within certain limits), the closer will be the approximation.*

e. An experiment along these lines will be convincing. A series of 260 official telegrams †
passing through the War Department Message Center was examined statistically. The
messages were divided into five sets, each totaling 10,000 letters, and the five distributions shown
in Table 1-A were obtained.

f. If the five distributions in Table 1-A are summed, the results are as shown in Table 2-A.



Table 1-A. — Absolute frequencies of letters appearing in five sets of Governmental plain-text
telegrams, each set containing 10,000 letters, arranged alphabetically

Letter  Message No. 1  Message No. 2  Message No. 3  Message No. 4  Message No. 5
        (absolute frequency in each set)

  A               738            783            681            740            741
  B               104            103             98             83             99
  C               319            300            288            326            301
  D               387            413            423            451            448
  E             1,367          1,294          1,292          1,270          1,275
  F               253            287            308            287            281
  G               166            175            161            167            150
  H               310            351            335            349            349
  I               742            750            787            700            697
  J                18             17             10             21             16
  K                36             38             22             21             31
  L               365            393            333            386            344
  M               242            240            238            249            268
  N               786            794            815            800            780
  O               685            770            791            756            762
  P               241            272            317            245            260
  Q                40             22             45             38             30
  R               760            745            762            735            786
  S               658            583            585            628            604
  T               936            879            894            958            928
  U               270            233            312            247            238
  V               163            173            142            133            155
  W               166            163            136            133            182
  X                43             50             44             53             41
  Y               191            155            179            213            229
  Z                14             17              2             11              5

Total          10,000         10,000         10,000         10,000         10,000

* See footnote 5, page 10.

† These comprised messages from several departments in addition to the War Department and were all of
an administrative character.

Table 2-A. — Absolute frequencies of letters appearing in the combined five sets of messages totaling
50,000 letters, arranged alphabetically

A 3,683    G   819    L 1,821    Q   175    V   766
B   487    H 1,694    M 1,237    R 3,788    W   780
C 1,534    I 3,676    N 3,975    S 3,058    X   231
D 2,122    J    82    O 3,764    T 4,595    Y   967
E 6,498    K   148    P 1,335    U 1,300    Z    49
F 1,416
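
The summation described in subparagraph f can be checked mechanically. The sketch below is illustrative only: the five lists are transcribed from Table 1-A (in the available copy some entries are hard to read and were resolved with the help of the row and column totals, so they should be treated as a reconstruction), and the names SETS and combine are invented for the example.

```python
import string

LETTERS = string.ascii_uppercase

# Absolute frequencies for A..Z in message sets 1-5, transcribed from Table 1-A.
SETS = [
    [738, 104, 319, 387, 1367, 253, 166, 310, 742, 18, 36, 365, 242,
     786, 685, 241, 40, 760, 658, 936, 270, 163, 166, 43, 191, 14],
    [783, 103, 300, 413, 1294, 287, 175, 351, 750, 17, 38, 393, 240,
     794, 770, 272, 22, 745, 583, 879, 233, 173, 163, 50, 155, 17],
    [681, 98, 288, 423, 1292, 308, 161, 335, 787, 10, 22, 333, 238,
     815, 791, 317, 45, 762, 585, 894, 312, 142, 136, 44, 179, 2],
    [740, 83, 326, 451, 1270, 287, 167, 349, 700, 21, 21, 386, 249,
     800, 756, 245, 38, 735, 628, 958, 247, 133, 133, 53, 213, 11],
    [741, 99, 301, 448, 1275, 281, 150, 349, 697, 16, 31, 344, 268,
     780, 762, 260, 30, 786, 604, 928, 238, 155, 182, 41, 229, 5],
]


def combine(distributions):
    """Add several 26-entry distributions together, letter by letter."""
    return [sum(column) for column in zip(*distributions)]


combined = combine(SETS)

# Each set should hold 10,000 letters and the combination 50,000.
for number, one_set in enumerate(SETS, start=1):
    print(f"Set {number}: {sum(one_set)} letters")
for letter, total in zip(LETTERS, combined):
    print(f"{letter} {total:5d}")
```

Comparing the printed totals with Table 2-A is a quick way to catch a copying error in either table.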



g. The frequencies noted in subparagraph f, when reduced to the basis of 1,000 letters and
then used as a basis for constructing a simple chart that will exhibit the variations in frequency
in a striking manner, yield the following graph which is hereafter designated as the normal, or
standard uniliteral frequency distribution for English telegraphic plain text:




   A   B   C   D   E   F   G   H   I   J   K   L   M   N   O   P   Q   R   S   T   U   V   W   X   Y   Z
  74  10  31  42 130  28  16  34  74   2   3  36  25  79  75  27   3  76  61  92  26  15  16   5  19   1

Figure 3.
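
The reduction to a basis of 1,000 letters can be sketched as follows. The text does not say how fractions were rounded; the sketch assumes largest-remainder rounding (take the whole-number part of each scaled count, then give the leftover units to the letters with the largest remainders), which keeps the total at exactly 1,000. The function name reduce_to_basis and the list TABLE_2A are invented for the example.

```python
import string

LETTERS = string.ascii_uppercase

# Combined counts for 50,000 letters, A..Z, as given in Table 2-A.
TABLE_2A = [3683, 487, 1534, 2122, 6498, 1416, 819, 1694, 3676, 82, 148, 1821, 1237,
            3975, 3764, 1335, 175, 3788, 3058, 4595, 1300, 766, 780, 231, 967, 49]


def reduce_to_basis(counts, basis):
    """Scale integer counts so that they total exactly `basis`.

    Assumed method: largest-remainder rounding, done in integer arithmetic.
    """
    total = sum(counts)
    scaled = [count * basis for count in counts]
    reduced = [value // total for value in scaled]        # whole-number parts
    shortfall = basis - sum(reduced)                      # units still to hand out
    # Letter positions ordered from the largest remainder to the smallest.
    order = sorted(range(len(counts)), key=lambda i: scaled[i] % total, reverse=True)
    for i in order[:shortfall]:
        reduced[i] += 1
    return reduced


per_1000 = reduce_to_basis(TABLE_2A, 1000)
for letter, value in zip(LETTERS, per_1000):
    print(f"{letter} {value:4d}")
```

Under this assumption the result agrees with the figures shown beneath Fig. 3. Simple rounding of each value to the nearest whole number would not necessarily total 1,000, which is why the remainder method was chosen for the sketch.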



10. Important features of the normal uniliteral frequency distribution. — a. When the graph 
shown in Fig. 3 is studied in detail, the following features are apparent: 

(1) It is quite irregular in appearance. This is because the letters are used with greatly 
varying frequencies, as discussed in the preceding paragraph. This irregular appearance is often 
described by saying that the graph shows marked crests and troughs, that is, points of high fre- 
quency and low frequency. 



(2) The relative positions in which the crests and troughs fall within the graph, that is, the
spatial relations of the crests and troughs, are rather definitely fixed and are determined by
circumstances which have been explained in a preceding text.*

(3) The relative heights and depths of the crests and troughs within the graph, that is, the 
linear extensions of the lines marking the respective frequencies, are also rather definitely fixed, 
as would be found if an equal volume of similar text were analyzed. 

(4) The most prominent crests are marked by the vowels A, E, I, O, and the consonants
N, R, S, T; the most prominent troughs are marked by the consonants J, K, Q, X, and Z.

(5) The important data are summarized in tabular form in Table 3.



Table 3

                                                 Frequency   Percent of   Percent of total
                                                               total      in round numbers

 6 Vowels: A E I O U Y                              398         39.8             40
20 Consonants:
    5 High Frequency (D N R S T)                    350         35.0             35
   10 Medium Frequency (B C F G H L M P V W)        238         23.8             24
    5 Low Frequency (J K Q X Z)                      14          1.4              1

   Total                                          1,000        100.0            100





(6) The frequencies of the letters of the alphabet are as follows:

A  74    G 16    L 36    Q  3    V 15
B  10    H 34    M 25    R 76    W 16
C  31    I 74    N 79    S 61    X  5
D  42    J  2    O 75    T 92    Y 19
E 130    K  3    P 27    U 26    Z  1
F  28

(7) The relative order of frequency of the letters is as follows:

E 130    I 74    C 31    Y 19    X 5
T  92    S 61    F 28    G 16    Q 3
N  79    D 42    P 27    W 16    K 3
R  76    L 36    U 26    V 15    J 2
O  75    H 34    M 25    B 10    Z 1
A  74

(8) The four vowels A, E, I, O (combined frequency 353) and the four consonants N, R, S, T
(combined frequency 308) form 661 out of every 1,000 letters of plain text; in other words, less
than 1/3 of the alphabet is employed in writing 2/3 of normal plain text.
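
The figures in items (5) through (8) all follow from the 26 frequencies in item (6), and the arithmetic can be sketched as below. The dictionary FREQ_PER_1000 is transcribed from item (6); the names GROUPS, group_totals, and top_eight are invented for the example.

```python
import string

# Frequencies per 1,000 letters, A..Z, transcribed from item (6).
FREQ_PER_1000 = dict(zip(string.ascii_uppercase, [
    74, 10, 31, 42, 130, 28, 16, 34, 74, 2, 3, 36, 25,
    79, 75, 27, 3, 76, 61, 92, 26, 15, 16, 5, 19, 1]))

# The four classes of letters used in Table 3.
GROUPS = {
    "vowels": "AEIOUY",
    "high-frequency consonants": "DNRST",
    "medium-frequency consonants": "BCFGHLMPVW",
    "low-frequency consonants": "JKQXZ",
}


def group_totals(freq, groups):
    """Sum the per-1,000 frequencies of the letters in each class."""
    return {name: sum(freq[letter] for letter in letters)
            for name, letters in groups.items()}


totals = group_totals(FREQ_PER_1000, GROUPS)
for name, value in totals.items():
    print(f"{name}: {value} per 1,000 ({value / 10:.1f} percent)")

# Item (8): the four vowels A, E, I, O plus the four consonants N, R, S, T.
top_eight = sum(FREQ_PER_1000[letter] for letter in "AEIONRST")
print("A E I O N R S T together:", top_eight)

# Item (7): letters in descending order of frequency (ties keep A-Z order here,
# which may differ from the printed order where two letters are equal).
order = "".join(sorted(FREQ_PER_1000, key=lambda letter: -FREQ_PER_1000[letter]))
print(order)
```

Eight letters are 8/26 of the alphabet, a little under one third, while 661 per 1,000 is very nearly two thirds, which is the comparison made in item (8).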



* Section VII, Elementary Military Cryptography.

b. The data given in Fig. 3 and Table 3 represent the relative frequencies found in a large
volume of English telegraphic text of a governmental, administrative character. These
frequencies will vary somewhat with the nature of the text analyzed. For example, if an equal
number of telegrams dealing solely with commercial transactions in the leather industry were
studied statistically, the frequencies would be slightly different because of the repeated occurrence
of words peculiar to that industry. Again, if an equal number of telegrams dealing solely with
military messages of a tactical character were studied statistically, the frequencies would differ
slightly from those found above for general governmental messages of an administrative character.

c. If ordinary English literary text (such as may be found in any book, newspaper, or printed
document) were analyzed, the frequencies of certain letters would be changed to an appreciable
degree. This is because in telegraphic text words which are not strictly essential for intelligibility
(such as the definite and indefinite articles, certain prepositions, conjunctions and pronouns) are
omitted. In addition, certain essential words, such as “stop”, “period”, “comma”, and the like,
which are usually indicated in written or printed matter by symbols not easy to transmit
telegraphically and which must, therefore, be spelled out in telegrams, occur very frequently.
Furthermore, telegraphic text often employs longer and more uncommon words than does ordinary
newspaper or book text.

d. As a matter of fact, other tables compiled in the Office of the Chief Signal Officer gave
slightly different results, depending upon the source of the text. For example, three tables based
upon 75,000, 100,000, and 136,257 letters taken from various sources (telegrams, newspapers,
magazine articles, books of fiction) gave as the relative order of frequency for the first 10 letters
the following:

For 75,000 letters     E T R N I O A S D L
For 100,000 letters    E T R I N O A S D L
For 136,257 letters    E T R N A O I S L D



Table 4. — Frequency table for 10,000 letters of literary English, as compiled by Hitt

ALPHABETICALLY ARRANGED

A   778    G 174    L 372    Q   8    V 112
B   141    H 595    M 288    R 651    W 176
C   296    I 667    N 686    S 622    X  27
D   402    J  51    O 807    T 855    Y 196
E 1,277    K  74    P 223    U 308    Z  17
F   197

ARRANGED ACCORDING TO FREQUENCY

E 1,277    R 651    U 308    Y 196    K 74
T   855    S 622    C 296    W 176    J 51
O   807    H 595    M 288    G 174    X 27
A   778    D 402    P 223    B 141    Z 17
N   686    L 372    F 197    V 112    Q  8
I   667

Hitt also compiled data for telegraphic text (but does not state what kind of messages) and
gives the following table:

Table 5. — Frequency table for 10,000 letters of telegraphic English, as compiled by Hitt

ALPHABETICALLY ARRANGED

A   813    G 201    L 392    Q  38    V 136
B   149    H 386    M 273    R 677    W 166
C   306    I 711    N 718    S 656    X  51
D   417    J  42    O 844    T 634    Y 208
E 1,319    K  88    P 243    U 321    Z   6
F   205

ARRANGED ACCORDING TO FREQUENCY

E 1,319    S 656    U 321    F 205    K 88
O   844    T 634    C 306    G 201    X 51
A   813    D 417    M 273    W 166    J 42
N   718    L 392    P 243    B 149    Q 38
I   711    H 386    Y 208    V 136    Z  6
R   677
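
Hitt's two tables can be compared letter by letter to see which frequencies shift most between literary and telegraphic text. The sketch below is illustrative only; the two lists are transcribed from Tables 4 and 5, and the names LITERARY, TELEGRAPHIC, differences, and ranked are invented for the example.

```python
import string

LETTERS = string.ascii_uppercase

# Counts per 10,000 letters, A..Z, transcribed from Table 4 (literary English).
LITERARY = dict(zip(LETTERS, [
    778, 141, 296, 402, 1277, 197, 174, 595, 667, 51, 74, 372, 288,
    686, 807, 223, 8, 651, 622, 855, 308, 112, 176, 27, 196, 17]))

# Counts per 10,000 letters, A..Z, transcribed from Table 5 (telegraphic English).
TELEGRAPHIC = dict(zip(LETTERS, [
    813, 149, 306, 417, 1319, 205, 201, 386, 711, 42, 88, 392, 273,
    718, 844, 243, 38, 677, 656, 634, 321, 136, 166, 51, 208, 6]))

# Positive values mean the letter is more frequent in telegraphic text.
differences = {letter: TELEGRAPHIC[letter] - LITERARY[letter] for letter in LETTERS}

# Letters ordered from the largest shift to the smallest, ignoring sign.
ranked = sorted(LETTERS, key=lambda letter: abs(differences[letter]), reverse=True)

for letter in ranked[:5]:
    print(f"{letter}: {differences[letter]:+d} per 10,000")
```

On these figures the two largest shifts are decreases in T and H, which would be consistent with the omission of words such as the definite article noted in subparagraph c, although the tables alone do not prove that cause.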



e. Frequency data applicable purely to English military text were compiled by Hitt,* from
a study of 10,000 letters taken from orders and reports. The frequencies found by him are given
in Tables 4 and 5.

11. Constancy of the standard or normal, uniliteral frequency distribution. — a. The
relative frequencies disclosed by the statistical study of large volumes of text may be considered
to be the standard or normal frequencies of the letters of written English. Counts made of
smaller volumes of text will tend to approximate these normal frequencies, and, within certain
limits,† the smaller the volume, the lower will be the degree of approximation to the normal,
until, in the case of a very short message, the normal proportions may not obtain at all. It is
advisable that the student fix this fact firmly in mind, for the sooner he realizes the true nature
of any data relative to the frequency of occurrence of letters in text, the less often will his labors
toward the solution of specific ciphers be thwarted and retarded by too strict an adherence to
these generalized principles of frequency. He should constantly bear in mind that such data
are merely statistical generalizations, that they will be found to hold strictly true only in large
volumes of text, and that they may not even be approximated in short messages.

* Op. cit., pp. 6-7.

† It is useless to go beyond a certain limit in establishing the normal-frequency distribution for a given
language. As a striking instance of this fact, witness the frequency study made by an indefatigable German,
Kaeding, who in 1898 made a count of the letters in about 11,000,000 words, totaling about 62,000,000 letters in
German text. When reduced to a percentage basis, and when the relative order of frequency was determined,
the results he obtained differed very little from the results obtained by Kasiski, a German cryptographer, from a
count of only 1,060 letters. See Kaeding, Haeufigkeitswoerterbuch, Steglitz, 1898; Kasiski, Die Geheimschriften
und die Dechiffrir-Kunst, Berlin, 1863.

b. Nevertheless the normal frequency distribution or the “normal expectancy” for any
alphabetic language is, in the last analysis, the best guide to, and the usual basis for, the solution
of cryptograms of a certain type. It is useful, therefore, to reduce the normal, uniliteral
frequency distribution to a basis that more or less closely approximates the volume of text which
the cryptanalyst most often encounters in individual cryptograms. As regards length of
messages, counting only the letters in the body, and excluding address and signature, a study of the
260 telegrams referred to in paragraph 9 shows that the arithmetical average is 217 letters;
the statistical mean, or weighted average,‡ however, is 191 letters. These two results are,
however, close enough together to warrant the statement that the average length of telegrams
is approximately 200 letters. The frequencies given in Par. 9f have therefore been reduced to
a basis of 200 letters, and the following uniliteral frequency distribution may be taken as showing
the most typical distribution to be expected in 200 letters of telegraphic English text:

[Figure 4: tally-mark uniliteral frequency distribution for 200 letters of telegraphic English text,
letters A through Z.]
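
The two kinds of average mentioned in subparagraph b, as they are defined in the accompanying footnote, can be sketched as follows. The message lengths used here are hypothetical and were invented purely for illustration; they are not the lengths of the 260 telegrams, and the function names are likewise invented.

```python
from collections import Counter


def average_of_distinct_lengths(lengths):
    """Add each different length once and divide by the number of different lengths."""
    distinct = sorted(set(lengths))
    return sum(distinct) / len(distinct)


def weighted_mean(lengths):
    """Multiply each different length by the number of messages of that length,
    add the products, and divide by the total number of messages."""
    tally = Counter(lengths)
    products = sum(length * number for length, number in tally.items())
    return products / sum(tally.values())


# Hypothetical message lengths in letters (illustration only).
sample_lengths = [100, 100, 100, 200, 300]

print(average_of_distinct_lengths(sample_lengths))
print(weighted_mean(sample_lengths))
```

With many short messages and a few long ones, the weighted figure falls below the average of the distinct lengths, which is the same direction of difference as between the 217 and 191 quoted in the text.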

c. The student should take careful note of the appearance of the distribution § shown in
Fig. 4, for it will be of much assistance to him in the early stages of his study. The manner of
setting down the tallies should be followed by him in making his own distributions, indicating
every fifth occurrence of a letter by an oblique tally. This procedure almost automatically
shows the total number of occurrences for each letter, and yet does not destroy the graphical
appearance of the distribution, especially if care is taken to use approximately the same amount
of space for each set of five tallies. Cross-section paper is very useful for this purpose.
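
The tallying rule in subparagraph c (every fifth occurrence marked by an oblique stroke, with equal space for each set of five) can be imitated in plain text. The choice of "|" for an ordinary stroke and "/" for the oblique fifth stroke is an assumption made for this sketch, as are the function names.

```python
def tally(count, stroke="|", fifth="/"):
    """Render a count as tally marks, closing each group of five with an oblique."""
    full_groups, remainder = divmod(count, 5)
    groups = [stroke * 4 + fifth] * full_groups
    if remainder:
        groups.append(stroke * remainder)
    # One space between groups keeps each set of five the same width.
    return " ".join(groups)


def print_distribution(counts):
    """Print one row of tallies per letter for a dict of letter counts."""
    for letter in sorted(counts):
        print(f"{letter} {tally(counts[letter])}")


print_distribution({"A": 14, "B": 3, "C": 8})
```

Because every complete group occupies the same width, the length of each row still reflects the frequency of the letter, and the total can be read off by counting groups.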

d. The word “uniliteral” in the designation “uniliteral frequency distribution” means
“single letter”, and it is to be inferred that other types of frequency distributions may be
encountered. For example, a distribution of pairs of letters, constituting a biliteral frequency
distribution, is very often used in the study of certain cryptograms in which it is desired that pairs
made by combining successive letters be listed. A biliteral distribution of A B C D E F would
take these pairs: AB, BC, CD, DE, EF. The distribution could be made in the form of a large
square divided up into 676 cells. When distributions beyond biliteral are required (triliteral,
quadraliteral, etc.) they can only be made by listing them in some order, for example,
alphabetically based on the 1st, 2d, 3d, . . . letter.
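
The biliteral distribution described in subparagraph d can be sketched as a square of 26 rows and 26 columns, one cell for each possible pair. This is an illustrative sketch only; the function names and the choice of a list of lists for the square are assumptions.

```python
import string


def successive_pairs(text):
    """List the overlapping pairs made by combining successive letters."""
    letters = [ch for ch in text.upper() if ch in string.ascii_uppercase]
    return [first + second for first, second in zip(letters, letters[1:])]


def biliteral_distribution(text):
    """Count each pair in a 26 x 26 square: row = first letter, column = second."""
    square = [[0] * 26 for _ in range(26)]
    for pair in successive_pairs(text):
        row = ord(pair[0]) - ord("A")
        column = ord(pair[1]) - ord("A")
        square[row][column] += 1
    return square


pairs = successive_pairs("A B C D E F")
square = biliteral_distribution("A B C D E F")
print(pairs)
print(sum(len(row) for row in square), "cells")
```

For longer groupings (triliteral and beyond) a square is no longer practical, and an alphabetically sorted list of the groups, as the text suggests, is the natural substitute.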

‡ The arithmetical average is obtained by adding each different length and dividing by the number of
different-length messages; the mean is obtained by multiplying each different length by the number of messages
of that length, adding all products, and dividing by the total number of messages.

§ The use of the terms "distribution" and "frequency distribution", instead of "table" and "frequency
table", respectively, is considered advisable from the point of view of consistency with the usual statistical
nomenclature. When data are given in tabular form, with frequencies indicated by numbers, then they may
properly be said to be set out in the form of a table. When, however, the same data are distributed in a chart
which partakes of the nature of a graph, with the data indicated by horizontal or vertical linear extensions, or
by a curve connecting points corresponding to quantities, then it is more proper to call such a graphic
representation of the data a distribution.

Section IV

FUNDAMENTAL USES OF THE UNILITERAL FREQUENCY DISTRIBUTION

Paragraph
The four facts which can be determined from a study of the uniliteral frequency distribution for a cryptogram 12
Determining the class to which a cipher belongs 13
Determining whether a substitution cipher is monoalphabetic or polyalphabetic 14
Determining whether the cipher alphabet is a standard, or a mixed cipher alphabet 15
Determining whether the standard cipher alphabet is direct or reversed 16



12. The four facts which can be determined from a study of the uniliteral frequency
distribution for a cryptogram. — a. The following four facts (to be explained subsequently) can
usually be determined from an inspection of the uniliteral frequency distribution for a given
cipher message of average length, composed of letters:

(1) Whether the cipher belongs to the substitution or the transposition class;

(2) If to the former, whether it is monoalphabetic or polyalphabetic in character;

(3) If monoalphabetic, whether the cipher alphabet is a standard cipher alphabet or a mixed
cipher alphabet;

(4) If standard, whether it is a direct or reversed standard cipher alphabet.

b. For immediate purposes the first two of the foregoing determinations are quite important
and will be discussed in detail in the next two subparagraphs; the other two determinations will
be touched upon very briefly, leaving their detailed discussion for subsequent sections of the
text.

13. Determining the class to which a cipher belongs. — a. The determination of the class
to which a cipher belongs is usually a relatively easy matter because of the fundamental difference
in the nature of transposition and of substitution as cryptographic processes. In a transposition
cipher the original letters of the plain text have merely been rearranged, without any change
whatsoever in their identities, that is, in the conventional values they have in the normal
alphabet. Hence, the numbers of vowels (A, E, I, O, U, Y), high-frequency consonants (D, N, R, S, T),
medium-frequency consonants (B, C, F, G, H, L, M, P, V, W), and low-frequency consonants (J, K,
Q, X, Z) are exactly the same in the cryptogram as they are in the plain-text message. Therefore,
the percentages of vowels, high, medium, and low-frequency consonants are the same in the
transposed text as in the equivalent plain text. In a substitution cipher, on the other hand, the
identities of the original letters of the plain text have been changed, that is, the conventional
values they have in the normal alphabet have been altered. Consequently, if a count is made
of the various letters present in such a cryptogram, it will be found that the number of vowels,
high, medium, and low-frequency consonants will usually be quite different in the cryptogram
from what they are in the original plain-text message. Therefore, the percentages of vowels,
high, medium, and low-frequency consonants are usually quite different in the substitution text
from what they are in the equivalent plain text. From these considerations it follows that if in
a specific cryptogram the percentages of vowels, high, medium, and low-frequency consonants
are approximately the same as would be expected in normal plain text, the cryptogram probably
belongs to the transposition class; if these percentages are quite different from those to be
expected in normal plain text the cryptogram probably belongs to the substitution class.
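
The reasoning of subparagraph a can be illustrated with a short sketch. Two stand-ins are assumed purely for illustration: writing the text backwards serves as a very simple transposition, and shifting every letter three places along the alphabet serves as a very simple substitution. The sample plain text is the first sentence of Par. 9a, and all function names are invented for the example.

```python
import string

# The four classes of letters named in subparagraph a.
GROUPS = {
    "vowels": "AEIOUY",
    "high-frequency consonants": "DNRST",
    "medium-frequency consonants": "BCFGHLMPVW",
    "low-frequency consonants": "JKQXZ",
}

PLAIN = (
    "It has long been known to cryptographers and typographers that the "
    "letters composing the words of any intelligible written text composed "
    "in any language which is alphabetic in construction are employed with "
    "greatly varying frequencies."
)


def class_percentages(text):
    """Percentage of the letters of text falling in each of the four classes."""
    letters = [ch for ch in text.upper() if ch in string.ascii_uppercase]
    return {name: 100 * sum(ch in members for ch in letters) / len(letters)
            for name, members in GROUPS.items()}


def reverse_transposition(text):
    """Stand-in transposition: the same letters written in reverse order."""
    return text[::-1]


def shift_substitution(text, shift=3):
    """Stand-in substitution: each letter replaced by the one `shift` places later."""
    shifted = []
    for ch in text.upper():
        if ch in string.ascii_uppercase:
            shifted.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
    return "".join(shifted)


plain_pct = class_percentages(PLAIN)
transposed_pct = class_percentages(reverse_transposition(PLAIN))
substituted_pct = class_percentages(shift_substitution(PLAIN))

for name in GROUPS:
    print(f"{name}: plain {plain_pct[name]:.1f}, "
          f"transposed {transposed_pct[name]:.1f}, "
          f"substituted {substituted_pct[name]:.1f}")
```

The transposed text gives exactly the percentages of the plain text, since no letter has changed its identity, while the substituted text gives quite different ones. How far the percentages of a real cryptogram may depart from the normal figures of Table 3 before the transposition class must be ruled out is a separate question, which a sketch of this kind does not answer.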

b. In the preceding subparagraph the word “probably” was emphasized by italicizing it,
for there can be no certainty in every case of this determination. Usually these percentages in
a transposition cipher are close to the normal percentages for plain text; usually, in a substitution
cipher, they are far different from the normal percentages for plain text. But occasionally
a cipher message is encountered which is difficult to classify with a reasonable degree of certainty
because the message is too short for the general principles of frequency to manifest themselves.
It is clear that if in actual messages there were no variation whatever from the normal vowel
and consonant percentages given in Table 3, the determination of the class to which a specific
cryptogram belongs would be an extremely simple matter. But unfortunately there is always
some variation or deviation from the normal. Intuition suggests that as messages decrease in
length there may be a greater and greater departure from the normal proportions of vowels,
high, medium, and low-frequency consonants, until in very short messages the normal proportions
may not hold at all. Similarly, as messages increase in length there may be a lesser and
lesser departure from the normal proportions, until in messages totalling a thousand or more
letters there may be no difference at all between the actual and the theoretical proportions.
But intuition is not enough, for in dealing with specific messages of the length of those commonly
encountered in practical work the question sometimes arises as to exactly how much deviation
(from the normal proportions) may be allowed for in a cryptogram which shows a considerable
amount of deviation from the normal and which might still belong to the transposition rather
than to the substitution class.

c. Statistical studies have been made on this matter and some graphs have been constructed
thereon. These are shown in Charts 1-4 in the form of simple curves, the use of which will now 
be explained. Each chart contains two curves marking the lower and upper limits, respectively, 
of the theoretical amount of deviation (from the normal percentages) of vowels or consonants 
which may be allowable in a cipher believed to belong to the transposition class. 

d. In Chart 1, curve V1 marks the lower limit of the theoretical amount of deviation from the
normal number of vowels to be expected in a message of given length; curve V2 marks the upper
limit of the same thing. Thus, for example, in a message of 100 letters in plain English there
should be between 33 and 47 vowels (A E I O U Y). Likewise, in Chart 2 curves H1 and H2
mark the lower and upper limits as regards the high-frequency consonants. In a message of 100
letters there should be between 28 and 42 high-frequency consonants (D N R S T). In Chart 3,
curves M1 and M2 mark the lower and upper limits as regards the medium-frequency consonants.
In a message of 100 letters there should be between 17 and 31 medium-frequency consonants
(B C F G H L M P V W). Finally, in Chart 4, curves L1 and L2 mark the lower and upper
limits as regards the low-frequency consonants. In a message of 100 letters there should be
between 0 and 3 low-frequency consonants (J K Q X Z). In using the charts, therefore, one
finds the point of intersection of the vertical coordinate corresponding to the length of the
message, with the horizontal coordinate corresponding to (1) the number of vowels, (2) the
number of high-frequency consonants, (3) the number of medium-frequency consonants, and
(4) the number of low-frequency consonants actually counted in the message. If all four points
of intersection fall within the area delimited by the respective curves, then the number of vowels,
high, medium, and low-frequency consonants corresponds with the number theoretically expected
in a normal plain-text message of the same length; since the message under investigation is not
plain text, it follows that the cryptogram may certainly be classified as a transposition cipher.
On the other hand, if one or more of these points of intersection falls outside the area delimited 
by the respective curves, it follows that the cryptogram is probably a substitution cipher. The 
distance that the point of intersection falls outside the area delimited by these curves is a more or 
less rough measure of the improbability of the cryptogram’s being a transposition cipher.

Chart No. 1. Curves marking the lower and upper limits of the theoretical amount of deviation from the normal number of vowels to be expected
in messages of various lengths. Horizontal axis: number of letters in message. (See Par. 13d.)

e. Sometimes a cryptogram is encountered which is hard to classify with certainty even with
the foregoing aids, because it has been consciously prepared with a view to making the classification
difficult. This can be done either by selecting peculiar words (as in “trick cryptograms”)
or by employing a cipher alphabet in which letters of approximately similar normal frequencies
have been interchanged. For example, E may be replaced by O, T by R, and so on, thus yielding
a cryptogram giving external indications of being a transposition cipher but which is really a
substitution cipher. If the cryptogram is not too short, a close study will usually disclose what 
has been done, as well as the futility of so simple a subterfuge. 
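
The chart test of subparagraph d can be sketched for the one message length whose limits are quoted in the text, namely 100 letters. The curves for other lengths are not available here, so the sketch below declines to judge messages of any other length; the function names and that restriction are assumptions of this illustration.

```python
from collections import Counter

# Letter classes as listed in subparagraph d.
CLASSES = {
    "vowels": "AEIOUY",
    "high": "DNRST",
    "medium": "BCFGHLMPVW",
    "low": "JKQXZ",
}

# Lower and upper limits quoted in subparagraph d for a 100-letter message.
LIMITS_100 = {
    "vowels": (33, 47),
    "high": (28, 42),
    "medium": (17, 31),
    "low": (0, 3),
}


def class_counts(text):
    """Count the letters of `text` that fall in each of the four classes."""
    tally = Counter(ch for ch in text.upper() if "A" <= ch <= "Z")
    return {name: sum(tally[ch] for ch in members)
            for name, members in CLASSES.items()}


def classify_100(text):
    """Return (class, classes_outside_limits) for a 100-letter cryptogram.

    All four counts inside their limits points to transposition; one or
    more outside points to (probable) substitution.
    """
    counts = class_counts(text)
    if sum(counts.values()) != 100:
        raise ValueError("limits are only known here for 100-letter messages")
    outside = [name for name, (low, high) in LIMITS_100.items()
               if not low <= counts[name] <= high]
    return ("transposition" if not outside else "substitution"), outside
```

The list of classes falling outside their limits is returned alongside the verdict, since the text treats the extent of the departure as a rough measure of how improbable transposition is.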

f. In the majority of cases, in practical work, the determination of the class to which a
cipher of average length belongs can be made from a mere inspection of the message, after the
cryptanalyst has acquired a familiarity with the normal appearance of transposition and of
substitution ciphers. In the former case, his eyes very speedily note many high-frequency letters,
such as E, T, N, R, O, and S, with the absence of low-frequency letters, such as J, K, Q, X,
and Z; in the latter case, his eyes just as quickly note the presence of many low-frequency letters,
and a corresponding absence of the usual high-frequency letters.

g. Another rather quickly completed test, in the case of the simpler varieties of ciphers, is
to look for repetitions of groups of letters. As will become apparent very soon, recurrences of
syllables, entire words and short phrases constitute a characteristic of all normal plain text.
Since a transposition cipher involves a change in the sequence of the letters composing a plain-text
message, such recurrences are broken up so that the cipher text no longer will show repetitions
of more or less lengthy sequences of letters. But if a cipher message does show many repetitions
and these are of several letters in length, say over four or five, the conclusion is at once warranted
that the cryptogram is most probably a substitution and not a transposition cipher. However,
for the beginner in cryptanalysis, it will be advisable to make the uniliteral frequency distribution,
and note the frequencies of the vowels, the high, medium, and low-frequency consonants. Then,
referring to Charts 1 to 4, he should carefully note whether or not the observed frequencies for

Chart No. 2. Curves marking the lower and upper limits of the theoretical amount of deviation from the normal number of high-frequency
consonants to be expected in messages of various lengths. Horizontal axis: number of letters in message. (See Par. 13d.)

Chart No. 3. Curves marking the lower and upper limits of the theoretical amount of deviation from the normal number of medium-frequency
consonants to be expected in messages of various lengths. Horizontal axis: number of letters in message. (See Par. 13d.)

of letters, figures and other symbols, it is immediately apparent that the cryptogram is a substitution
cipher.

i. Finally, it should be mentioned that there are certain kinds of cryptograms whose class
cannot be determined by the method set forth in subparagraphs b, c, d above. These exceptions
will be discussed in a subsequent section of this text.¹
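
The search for repetitions described in subparagraph g lends itself to a short sketch. The function name and the default group length of five letters (taken from the phrase "say over four or five") are assumptions of this illustration.

```python
from collections import defaultdict


def repeated_groups(text, length=5):
    """Map each group of `length` letters occurring more than once
    to the list of positions (counted from 0) at which it begins."""
    letters = "".join(ch for ch in text.upper() if "A" <= ch <= "Z")
    positions = defaultdict(list)
    for start in range(len(letters) - length + 1):
        positions[letters[start:start + length]].append(start)
    return {group: where for group, where in positions.items()
            if len(where) > 1}


if __name__ == "__main__":
    # A repeated phrase survives any monoalphabetic substitution,
    # so its repetitions show up in the cipher text as well.
    sample = "ATTACK AT DAWN ATTACK AT DUSK"
    for group, where in repeated_groups(sample).items():
        print(group, where)
```

Many long repetitions point toward substitution; their absence proves nothing by itself in a short message.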

¹ Par. 47.

14. Determining whether a substitution cipher is monoalphabetic or polyalphabetic. a. It
will be remembered that a monoalphabetic substitution cipher is one in which a single cipher
alphabet is employed throughout the whole message, that is, a given plain-text letter is invariably
represented throughout the message by one and the same letter in the cipher text. On the other
hand, a polyalphabetic substitution cipher is one in which two or more cipher alphabets are
employed within the same message; that is, a given plain-text letter may be represented by two or
more different letters in the cipher text, according to some rule governing the selection of the
equivalent to be used in each case. From this it follows that a single cipher letter may represent
two or more different plain-text letters.

Chart No. 4. Curves marking the lower and upper limits of the theoretical amount of deviation from the normal number of low-frequency
consonants to be expected in messages of various lengths. (See Par. 13d.)

b. It is easy to see why and how the appearance of the uniliteral frequency distribution for
a substitution cipher may be used to determine whether the cryptogram is monoalphabetic or
polyalphabetic in character. The normal distribution presents marked crests and troughs by
virtue of two circumstances. First, the elementary sounds which the symbols represent are
used with greatly varying frequencies, it being one of the striking characteristics of every alphabetic
language that its elementary sounds are used with greatly varying frequencies.² In the
second place, except for orthographic aberrations peculiar to certain languages (conspicuously,
English and French), each such sound is represented by the same symbol. It follows, therefore,
that since in a monoalphabetic substitution cipher each different plain-text letter (= elementary
sound) is represented by one and only one cipher letter (= elementary symbol), the uniliteral
frequency distribution for such a cipher message must also exhibit the irregular crest and trough
appearance of the normal distribution, but with only this important modification: the absolute
positions of the crests and troughs will not be the same as in the normal. That is, the letters
accompanying the crests and the troughs in the distribution for the cryptogram will be different from
those accompanying the crests and the troughs in the normal distribution. But the marked
irregularity of the distribution, the presence of accentuated crests and troughs, is in itself an
indication that each symbol or cipher letter always represents the same plain-text letter in that
cryptogram. Hence the general rule: A marked crest and trough appearance in the uniliteral
frequency distribution for a given cryptogram indicates that a single cipher alphabet is involved and
constitutes one of the tests for a monoalphabetic substitution cipher.

² The student who is interested in this phase of the subject may find the following reference of value: Zipf,
G. K., Selected Studies of the Principle of Relative Frequency in Language, Cambridge, Mass., 1932.
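
A uniliteral frequency distribution of the kind inspected in subparagraph b can be tabulated as follows. This is a minimal sketch; the function names and the layout of one letter per line with tally marks are assumptions of the illustration, chosen so that crests and troughs can be seen at a glance.

```python
from collections import Counter
import string


def uniliteral_distribution(text):
    """Count each of the 26 letters in `text`, including absent ones."""
    tally = Counter(ch for ch in text.upper() if "A" <= ch <= "Z")
    return {ch: tally[ch] for ch in string.ascii_uppercase}


def render(distribution):
    """One line per letter: the letter, its count, and one mark per occurrence."""
    return "\n".join(f"{ch} {count:3d} {'|' * count}"
                     for ch, count in distribution.items())


if __name__ == "__main__":
    print(render(uniliteral_distribution("Attack at dawn")))
```

Long rows of marks are the crests and empty rows the troughs; a monoalphabetic cryptogram keeps this irregular outline, though against different letters.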

c. On the other hand, suppose that in a cryptogram each cipher letter represents several
different plain-text letters. Some of them are of high frequency, others of low frequency. The
net result of such a situation, so far as the uniliteral frequency distribution for the cryptogram
is concerned, is to prevent the appearance of any marked crests and troughs and to tend to reduce
the elements of the distribution to a more or less common level. This imparts a “flattened
out” appearance to the distribution. For example, in a certain cryptogram of polyalphabetic
construction, Kc = Ep, Gp, and Jp; Rc = Ap, Ip, and Bp; Xc = Op, Lp, and Fp. The frequencies of
Kc, Rc, and Xc will be approximately equal because the summations of the frequencies of the several
plain-text letters each of these cipher letters represents at different times will be about equal.
If this same phenomenon were true of all the letters of the cryptogram, it is clear that the
frequencies of the 26 letters, when shown by means of the ordinary uniliteral frequency distribution,
would show no striking differences and the distribution would have the flat appearance of
a typical polyalphabetic substitution cipher. Hence, the general rule: The absence of marked
crests and troughs in the uniliteral frequency distribution indicates that two or more cipher alphabets
are involved. The flattened-out appearance of the distribution constitutes one of the tests for a
polyalphabetic substitution cipher.
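
The flattening effect described in subparagraph c can be illustrated numerically. The sketch below rests on several assumptions that are not in the text: the cipher alphabets are taken to be simple shifts of the normal alphabet used equally often, the letter frequencies are approximate modern figures for English rather than the values of Table 3, and the "spread" (highest minus lowest expected frequency) is merely a convenient stand-in for judging crests and troughs by eye.

```python
import string

# ASSUMPTION: approximate relative frequencies of English letters, in percent.
ENGLISH_FREQ = {
    "A": 8.167, "B": 1.492, "C": 2.782, "D": 4.253, "E": 12.702, "F": 2.228,
    "G": 2.015, "H": 6.094, "I": 6.966, "J": 0.153, "K": 0.772, "L": 4.025,
    "M": 2.406, "N": 6.749, "O": 7.507, "P": 1.929, "Q": 0.095, "R": 5.987,
    "S": 6.327, "T": 9.056, "U": 2.758, "V": 0.978, "W": 2.360, "X": 0.150,
    "Y": 1.974, "Z": 0.074,
}


def normal_distribution():
    """Plain-text letter probabilities, normalized to sum to 1."""
    total = sum(ENGLISH_FREQ.values())
    return [ENGLISH_FREQ[ch] / total for ch in string.ascii_uppercase]


def cipher_distribution(shifts):
    """Expected cipher-letter probabilities when each shifted alphabet in
    `shifts` is used equally often (a hypothetical polyalphabetic scheme)."""
    plain = normal_distribution()
    return [sum(plain[(c - s) % 26] for s in shifts) / len(shifts)
            for c in range(26)]


def spread(distribution):
    """Difference between the highest crest and the deepest trough."""
    return max(distribution) - min(distribution)


if __name__ == "__main__":
    for shifts in ([3], [0, 5, 11], list(range(26))):
        print(len(shifts), "alphabet(s): spread =",
              round(spread(cipher_distribution(shifts)), 4))
```

With a single alphabet the spread equals that of normal plain text; as more alphabets share the work, each cipher letter averages several plain-text frequencies and the spread shrinks toward zero.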

d. The foregoing test based upon the appearance of the frequency distribution constitutes
only one of several means of determining whether a substitution cipher is monoalphabetic or
polyalphabetic in composition. It can be employed in cases yielding frequency distributions
from which definite conclusions can be drawn with more or less certainty by mere ocular examination.
In those cases in which the frequency distributions contain insufficient data to permit
drawing definite conclusions by such examination, certain statistical tests can be applied. These
will be discussed in a subsequent text.

e. At this point, however, one additional test will be given because of its simplicity of appli- 
cation. It may be employed in testing messages up to 200 letters in length, it being assumed that 
in messages of greater length ocular examination of the frequency distribution offers little or no 
difficulty. This test concerns the number of blanks in the frequency distribution, that is, the 
number of letters of the alphabet which are entirely absent from the message. It has been 
found from statistical studies that rather definite “laws” govern the theoretically expected num- 
ber of blanks in normal plain-text messages and in frequency distributions for cryptograms of 
different natures and of various sizes. The results of certain of these studies have been embodied 
in Chart 5. 

f. This chart contains two curves. The one labeled P applies to the average number of
blanks theoretically expected in frequency distributions based upon normal plain-text messages
of the indicated lengths. The other curve, labeled R, applies to the average number of blanks
theoretically expected in frequency distributions based upon perfectly random assortments of
letters; that is, assortments such as would be found by random selection of letters out of a hat
containing thousands of letters, all of the 26 letters of the alphabet being present in equal proportions,
each letter being replaced after a record of its selection has been made. Such random
assortments correspond to polyalphabetic cipher messages in which the number of cipher alphabets
is so large that if uniliteral frequency distributions are made of the letters, the distributions
are practically identical with those which are obtained by random selections of letters out of a hat.

g. In using this chart, one finds the point of intersection of the vertical coordinate corresponding
to the length of the message, with the horizontal coordinate corresponding to the
observed number of blanks in the distribution for the message. If this point of intersection falls
closer to curve P than it does to curve R, the number of blanks in the message approximates or
corresponds more closely to the number theoretically expected in a plain-text message than it
does to a random (cipher-text) message of the same length; therefore, this is evidence that the
cryptogram is monoalphabetic. Conversely, if this point of intersection falls closer to curve R
than to curve P, the number of blanks in the message approximates or corresponds more closely
to the number theoretically expected in a random text than it does to a plain-text message of the
same length; therefore, this is evidence that the cryptogram is polyalphabetic.

Chart No. 5. Curves showing the average number of blanks theoretically expected in distributions for plain text (P) and for random text (R) for
messages of various lengths. (See Par. 14f.)

h. Practical examples of the use of this chart will be given in some of the illustrative messages 
to follow. 
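
Pending those examples, the blank test can be sketched numerically. Curve R follows directly from the description in subparagraph f: with 26 equally likely letters drawn with replacement, any one letter is absent from a message of n letters with probability (25/26)^n, so the expected number of blanks is 26(25/26)^n. Curve P is approximated here under two assumptions that are not from the text: plain-text letters are treated as independent draws, and approximate modern English frequencies are used instead of the values behind Chart 5. The figures will therefore differ somewhat from the chart.

```python
# ASSUMPTION: approximate relative frequencies of English letters, in percent.
ENGLISH_FREQ = {
    "A": 8.167, "B": 1.492, "C": 2.782, "D": 4.253, "E": 12.702, "F": 2.228,
    "G": 2.015, "H": 6.094, "I": 6.966, "J": 0.153, "K": 0.772, "L": 4.025,
    "M": 2.406, "N": 6.749, "O": 7.507, "P": 1.929, "Q": 0.095, "R": 5.987,
    "S": 6.327, "T": 9.056, "U": 2.758, "V": 0.978, "W": 2.360, "X": 0.150,
    "Y": 1.974, "Z": 0.074,
}


def expected_blanks_random(n):
    """Curve R: expected blanks among n letters drawn uniformly at random."""
    return 26 * (25 / 26) ** n


def expected_blanks_plain(n):
    """Approximation to curve P, treating letters as independent draws."""
    total = sum(ENGLISH_FREQ.values())
    return sum((1 - f / total) ** n for f in ENGLISH_FREQ.values())


def count_blanks(text):
    """Number of letters of the alphabet entirely absent from the message."""
    present = {ch for ch in text.upper() if "A" <= ch <= "Z"}
    return 26 - len(present)


def blanks_test(text):
    """Compare the observed blanks with both curves at the message length."""
    n = sum(1 for ch in text.upper() if "A" <= ch <= "Z")
    blanks = count_blanks(text)
    distance_to_p = abs(blanks - expected_blanks_plain(n))
    distance_to_r = abs(blanks - expected_blanks_random(n))
    return "monoalphabetic" if distance_to_p < distance_to_r else "polyalphabetic"
```

As in the text, the verdict is only evidence; the comparison is least informative where the two curves lie close together, which is why the test is confined to messages of moderate length.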

15. Determining whether the cipher alphabet is a standard, or a mixed cipher alphabet.
a. Assuming that the uniliteral frequency distribution for a given cryptogram has been made, and
that it shows clearly that the cryptogram is a substitution cipher and is monoalphabetic in
character, a consideration of the nature of standard cipher alphabets ³ almost makes it obvious
how an inspection of the distribution will disclose whether the cipher alphabet involved is a
standard cipher alphabet or a mixed cipher alphabet. If the crests and troughs of the distribution
occupy positions which correspond to the relative positions they occupy in the normal
frequency distribution, then the cipher alphabet is a standard cipher alphabet. If this is not the
case, then it is highly probable that the cryptogram has been prepared by the use of a mixed
cipher alphabet.

b. A mechanical test may be applied in doubtful cases arising from lack of material available
for study. Just what this test involves, and an illustration of its application will be given in the
next section, using specific examples.

³ See Sec. VIII, Elementary Military Cryptography.
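
The comparison of relative positions of crests and troughs can be imitated by sliding the normal distribution against the observed one. The sketch below is not the mechanical test promised in the text; it swaps in a simple correlation score (observed count times expected frequency, summed over the alphabet) and tries every standard alphabet, direct or reversed. The conventions are assumptions of this illustration: a direct alphabet with shift s sends plain letter p to p + s, a reversed one sends p to s - p (both modulo 26), and the frequencies are approximate modern English values.

```python
import string
from collections import Counter

ALPHABET = string.ascii_uppercase

# ASSUMPTION: approximate relative frequencies of English letters, in percent.
ENGLISH_FREQ = {
    "A": 8.167, "B": 1.492, "C": 2.782, "D": 4.253, "E": 12.702, "F": 2.228,
    "G": 2.015, "H": 6.094, "I": 6.966, "J": 0.153, "K": 0.772, "L": 4.025,
    "M": 2.406, "N": 6.749, "O": 7.507, "P": 1.929, "Q": 0.095, "R": 5.987,
    "S": 6.327, "T": 9.056, "U": 2.758, "V": 0.978, "W": 2.360, "X": 0.150,
    "Y": 1.974, "Z": 0.074,
}


def encipher_standard(plain, shift, reverse=False):
    """Encipher with a (hypothetical) standard alphabet, direct or reversed."""
    out = []
    for ch in plain.upper():
        if "A" <= ch <= "Z":
            p = ord(ch) - 65
            c = (shift - p) % 26 if reverse else (p + shift) % 26
            out.append(chr(c + 65))
    return "".join(out)


def best_standard_fit(cipher):
    """Return (direction, shift, score) of the best-matching standard alphabet."""
    counts = Counter(ch for ch in cipher.upper() if "A" <= ch <= "Z")
    best = None
    for direction in ("direct", "reversed"):
        for shift in range(26):
            score = 0.0
            for c in range(26):
                # Plain letter that cipher letter c would stand for.
                p = (shift - c) % 26 if direction == "reversed" else (c - shift) % 26
                score += counts[ALPHABET[c]] * ENGLISH_FREQ[ALPHABET[p]]
            if best is None or score > best[2]:
                best = (direction, shift, score)
    return best
```

The sketch always names some best candidate, so it cannot by itself declare an alphabet mixed; whether the best fit is good enough remains a matter for inspection, and short messages may mislead it.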

16. Determing whether the standard cipher alphabet is direct or reversed. — ^Assuming 
that tbe frequency distribution for a given cr 3 rptogram shows clearly that a standard cipher 
alphabet is involved, the determination as to whether the alphabet is direct or reversed can also 
be made by inspection, since the difference between the two is merely a matter of the direction 
in which the sequence of crests and troughs progresses — to the right, as in normal reading or 
writing, or the left. In a direct cipher alphabet the direction in which the crests and troughs 
of the distribution should be read is the normal direction, from left to right; in a reversed cipher 
alphabet this direction is reversed, from right to left. 
Section V

UNILITERAL SUBSTITUTION WITH STANDARD CIPHER ALPHABETS

Paragraph

Principles of solution by construction and analysis of the uniliteral frequency distribution 17

Theoretical example of solution 18

Practical example of solution by the frequency method 19

Solution by completing the plain-component sequence 20

Special remarks on the method of solution by completing the plain-component sequence 21

Value of mechanical solution as a short cut 22

17. Principles of solution by construction and analysis of the uniliteral frequency distribution.
a. Standard cipher alphabets are of two sorts, direct and reversed. The analysis of
monoalphabetic cryptograms prepared by their use follows almost directly from a consideration of
the nature of such alphabets. Since the cipher component of a standard cipher alphabet consists
either of the normal sequence merely displaced 1, 2, 3, . . . intervals from the normal point of
coincidence, or of the normal sequence proceeding in a reversed-normal direction, it is obvious
that the uniliteral frequency distribution for a cryptogram prepared by means of such a cipher
alphabet employed monoalphabetically will show crests and troughs whose relative positions
and frequencies will be exactly the same as in the uniliteral frequency distribution for the plain
text of that cryptogram. The only thing that has happened is that the whole set of crests and
troughs of the distribution has been displaced to the right or left of the position it occupies in the
distribution for the plain text; or else the successive elements of the whole set progress in the
opposite direction. Hence, it follows that the correct determination of the plain-text value of the
letter marking any crest or trough of the uniliteral frequency distribution will result at one
stroke in the correct determination of the plain-text values of all the remaining 25 letters
respectively marking the other crests and troughs in that distribution. Thus, having determined the
value of a single element of the cipher component of the cipher alphabet, the values of all the
remaining letters of the cipher component are automatically solved at one stroke. In more
simple language, the correct determination of the value of a single letter of the cipher text
automatically gives the values of the other 25 letters of the cipher text. The problem thus
resolves itself into a matter of selecting that point of attack which will most quickly or most
easily lead to the determination of the value of one cipher letter. The single word identification
will hereafter be used for the phrase “determination of the value of a cipher letter”; to identify a
cipher letter is to find its plain-text value.

b. It is obvious that the easiest point of attack is to assume that the letter marking the crest
of greatest frequency in the frequency distribution for the cryptogram represents Ep. Proceeding
from this initial point, the identifications of the remaining cipher letters marking the other crests
and troughs are tentatively made on the basis that the letters of the cipher component proceed
in accordance with the normal alphabetic sequence, either direct or reversed. If the actual
frequency of each letter marking a crest or a trough approximates to a fairly close degree the
normal theoretical frequency of the assumed plain-text equivalent, then the initial identification
θc=Ep may be assumed to be correct and therefore the derived identifications of the other cipher
letters may be assumed to be correct. If the original starting point for assignment of plain-text
values is not correct, or if the direction of “reading” the successive crests and troughs of the
distribution is not correct, then the frequencies of the other 25 cipher letters will not correspond
to or even approximate the normal theoretical frequencies of their hypothetical plain-text
equivalents on the basis of the initial identification. A new initial point, that is, a different cipher
equivalent must then be selected to represent Ep; or else the direction of “reading” the crests and
troughs must be reversed. This procedure, that is, the attempt to make the actual frequency
relations exhibited by the uniliteral frequency distribution for a given cryptogram conform to the
theoretical frequency relations of the normal frequency distribution in an effort to solve the
cryptogram, is referred to technically as “fitting the actual uniliteral frequency distribution for a
cryptogram to the theoretical uniliteral frequency distribution for normal plain text”, or, more
briefly, as “fitting the frequency distribution for the cryptogram to the normal frequency
distribution”, or, still more briefly, “fitting the distribution to the normal.” In statistical work the
expression commonly employed in connection with this process of fitting an actual distribution to a
theoretical one is “testing the goodness of fit.” The goodness of fit may be stated in various ways,
mathematical in character.
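
The fitting procedure described in this subparagraph can be sketched in modern terms. The following is a minimal illustration, not the author's method: it tries every direct and every reversed standard alphabet and scores each by how well the implied plain-text frequencies agree with a table of normal frequencies. The frequency table (approximate percentages for English), the scoring rule (a simple sum of products) and all function names are assumptions introduced for this example.

```python
from collections import Counter
from string import ascii_uppercase as ALPHABET

# Approximate relative frequencies of English letters, in percent.
# These values are an assumption for illustration, not taken from the text.
NORMAL = dict(zip(ALPHABET, [
    8.2, 1.5, 2.8, 4.3, 12.7, 2.2, 2.0, 6.1, 7.0, 0.15, 0.8, 4.0, 2.4,
    6.7, 7.5, 1.9, 0.1, 6.0, 6.3, 9.1, 2.8, 1.0, 2.4, 0.15, 2.0, 0.07]))


def deciphering_alphabet(key, reversed_alphabet=False):
    """Map cipher letters to plain letters for the standard alphabet in
    which plain A is represented by the cipher letter `key`."""
    k = ALPHABET.index(key)
    table = {}
    for p, plain in enumerate(ALPHABET):
        c = (k - p) % 26 if reversed_alphabet else (k + p) % 26
        table[ALPHABET[c]] = plain
    return table


def fit_to_normal(cryptogram):
    """Return (score, key, reversed_alphabet, plain_text) for the best fit."""
    letters = [ch for ch in cryptogram.upper() if ch in ALPHABET]
    counts = Counter(letters)
    best = None
    for reversed_alphabet in (False, True):
        for key in ALPHABET:
            table = deciphering_alphabet(key, reversed_alphabet)
            # The score is high when frequent cipher letters are assigned
            # to plain letters that are frequent in normal text.
            score = sum(n * NORMAL[table[c]] for c, n in counts.items())
            if best is None or score > best[0]:
                plain = "".join(table[c] for c in letters)
                best = (score, key, reversed_alphabet, plain)
    return best
```

With a short cryptogram the best-scoring alphabet is only a candidate. The final test remains whether the substitution yields intelligible plain text.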

c. In fitting the actual distribution to the normal, it is necessary to regard the cipher
component (that is, the letters A . . . Z marking the successive crests and troughs of the distribution)
as partaking of the nature of a wheel or sequence closing in upon itself, so that no matter with
what crest or trough one starts, the spatial and frequency relations of the crests and troughs are
constant. This manner of regarding the cipher component as being cyclic in nature is valid
because it is obvious that the relative positions and frequencies of the crests and troughs of any
uniliteral frequency distribution must remain the same regardless of what letter is employed as the
initial point of the distribution. Fig. 5 gives a clear picture of what is meant in this connection, as
applied to the normal frequency distribution.

[Diagram: the normal frequency distribution arranged in cyclic form.]

Figure 5.

d. In the third sentence of subparagraph b, the phrase “assumed to be correct” was
advisedly employed in describing the results of the attempt to fit the distribution to the normal,
because the final test of the goodness of fit in this connection (that is, of the correctness of the
assignment of values to the crests and troughs of the distribution) is whether the consistent
substitution of the plain-text values of the cipher characters in the cryptogram will yield
intelligible plain text. If this is not the case, then no matter how close the approximation between
actual and theoretical frequencies is, no matter how well the actual frequency distribution fits
the normal, the only possible inferences are that (1) either the closeness of the fit is a pure
coincidence in this case, and that another equally good fit may be obtained from the same data, or
else (2) the cryptogram involves something more than simple monoalphabetic substitution by
means of a single standard cipher alphabet. For example, suppose a transposition has been
applied in addition to the substitution. Then, although an excellent correspondence between
the uniliteral frequency distribution and the normal frequency distribution has been obtained,
the substitution of the cipher letters by their assumed equivalents will still not yield plain text.
However, aside from such cases of double encipherment, instances in which the uniliteral
frequency distribution may be easily fitted to the normal frequency distribution and in which at
the same time an attempted simple substitution fails to yield intelligible text are rare. It may be
said that, in practical operations, whenever the uniliteral frequency distribution can be made to
fit the normal frequency distribution, substitution of values will result in solution; and, as a
corollary, whenever the uniliteral frequency distribution cannot be made to fit the normal
frequency distribution, the cryptogram does not represent a case of simple, monoalphabetic
substitution by means of a standard alphabet.

18. Theoretical example of solution. — a. The foregoing principles will become clearer by 
noting the cryptographing and solution of a theoretical example. The following message is to be 
cryptographed. 

HOSTILE FORCE ESTIMATED AT ONE REGIMENT INFANTRY AND TWO PLATOONS CAVALRY
MOVING SOUTH ON QUINNIMONT PIKE STOP HEAD OF COLUMN NEARING ROAD JUNCTION SEVEN
THREE SEVEN COMMA EAST OF GREENACRE SCHOOL FIRED UPON BY OUR PATROLS STOP
HAVE DESTROYED BRIDGE OVER INDIAN CREEK.

b. First, solely for purposes of demonstrating certain principles, the uniliteral frequency dis- 
tribution for this message is presented in Figure 6. 

[Uniliteral frequency distribution of the plain-text message, plotted over the letters below.]

ABCDEFGHIJKLMNOPQRSTUVWXYZ

Figure 6.

c. Now let the foregoing message be cryptographed monoalphabetically by the following
cipher alphabet, yielding the cryptogram and the frequency distribution shown below. 

Plain ABCDEFGHIJKLMNOPQRSTUVWXYZ 

Cipher. GHIJKLMNOPQRSTUVWXYZABCDEF 



Plain ..... HOSTI LEFOR CEEST IMATE DATON EREGI MENTI NFANT RYAND
Cipher .... NUYZO RKLUX IKKYZ OSGZK JGZUT KXKMO SKTZO TLGTZ XEGTJ

Plain ..... TWOPL ATOON SCAVA LRYMO VINGS OUTHO NQUIN NIMON TPIKE
Cipher .... ZCUVR GZUUT YIGBG RXESU BOTMY UAZNU TWAOT TOSUT ZVOQK

Plain ..... STOPH EADOF COLUM NNEAR INGRO ADJUN CTION SEVEN THREE
Cipher .... YZUVN KGJUL IURAS TTKGX OTMXU GJPAT IZOUT YKBKT ZNXKK

Plain ..... SEVEN COMMA EASTO FGREE NACRE SCHOO LFIRE DUPON BYOUR
Cipher .... YKBKT IUSSG KGYZU LMXKK TGIXK YINUU RLOXK JAVUT HEUAX

Plain ..... PATRO LSSTO PHAVE DESTR OYEDB RIDGE OVERI NDIAN CREEK
Cipher .... VGZXU RYYZU VNGBK JKYZX UEKJH XOJMK UBKXO TJOGT IXKKQ
Cryptogram

NUYZO RKLUX IKKYZ OSGZK JGZUT KXKMO
SKTZO TLGTZ XEGTJ ZCUVR GZUUT YIGBG
RXESU BOTMY UAZNU TWAOT TOSUT ZVOQK
YZUVN KGJUL IURAS TTKGX OTMXU GJPAT
IZOUT YKBKT ZNXKK YKBKT IUSSG KGYZU
LMXKK TGIXK YINUU RLOXK JAVUT HEUAX
VGZXU RYYZU VNGBK JKYZX UEKJH XOJMK
UBKXO TJOGT IXKKQ

[Uniliteral frequency distribution of the cryptogram, plotted over the letters below.]

ABCDEFGHIJKLMNOPQRSTUVWXYZ

Figure 7.
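
The encipherment in subparagraph c can be reproduced with a simple translation table. This is a minimal sketch: the cipher component is the one given in the text, while the function names and the grouping into fives are conveniences assumed for the example.

```python
from string import ascii_uppercase as PLAIN

CIPHER = "GHIJKLMNOPQRSTUVWXYZABCDEF"  # cipher component given in Par. 18c

ENCIPHER = str.maketrans(PLAIN, CIPHER)
DECIPHER = str.maketrans(CIPHER, PLAIN)


def groups_of_five(text):
    """Drop everything but letters and split into five-letter groups."""
    letters = "".join(ch for ch in text.upper() if ch in PLAIN)
    return " ".join(letters[i:i + 5] for i in range(0, len(letters), 5))


def encipher(plain_text):
    """Encipher monoalphabetically; spaces between groups are kept."""
    return groups_of_five(plain_text).translate(ENCIPHER)


def decipher(cryptogram):
    """Reverse the substitution, leaving the grouping untouched."""
    return cryptogram.upper().translate(DECIPHER)


print(encipher("HOSTILE FORCE ESTIMATED AT ONE REGIMENT"))
```

The first two groups printed are NUYZO RKLUX, agreeing with the cryptogram above.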



d. Let the student now compare Figs. 6 and 7, which have been superimposed in Fig. 8
for convenience in examination. Crests and troughs are present in both distributions; moreover
their relative positions and frequencies have not been changed in the slightest particular. Only
the absolute position of the sequence as a whole has been displaced six intervals to the right in
Fig. 7, as compared with the absolute position of the sequence in Fig. 6.



[The distributions of Figs. 6 and 7, each plotted over A B C . . . Z, one above the other.]

Figure 8.
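
The displacement described in subparagraph d can be checked directly. The sketch below is an illustration under assumed names: it tallies the letters of a plain text and of its encipherment by a direct standard alphabet, and confirms that the second tally is the first moved six places to the right, cyclically.

```python
from collections import Counter
from string import ascii_uppercase as ALPHABET


def distribution(text):
    """Uniliteral frequency distribution as a list of 26 counts, A to Z."""
    counts = Counter(ch for ch in text.upper() if ch in ALPHABET)
    return [counts[letter] for letter in ALPHABET]


def shift_encipher(text, shift):
    """Direct standard alphabet: each letter moved `shift` places forward."""
    return "".join(ALPHABET[(ALPHABET.index(ch) + shift) % 26]
                   for ch in text.upper() if ch in ALPHABET)


plain = "HOSTILE FORCE ESTIMATED AT ONE REGIMENT INFANTRY"
cipher = shift_encipher(plain, 6)  # Ap = Gc, as in Par. 18c

plain_counts = distribution(plain)
cipher_counts = distribution(cipher)

# Moving the plain-text counts six places to the right (cyclically)
# reproduces the cipher-text counts exactly.
displaced = plain_counts[-6:] + plain_counts[:-6]
print(displaced == cipher_counts)  # prints True
```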

e. If the two distributions are compared in detail the student will clearly understand how
easy the solution of the cryptogram would be to one who knew nothing about how it was prepared.
For example, the frequency of the highest crest, representing Ep in Fig. 6, is 28; at an interval of
four letters before there is another crest representing Ap with frequency 16. Between A and E
there is a trough, representing the low-frequency letters B, C, D. On the other side of E, at an
interval of four letters, comes another crest, representing I with frequency 14. Between E and I
there is another trough, representing the low-frequency letters F, G, H. Compare these crests
and troughs with their homologous crests and troughs in Fig. 7. In the latter, the letter K
marks the highest crest in the distribution with a frequency of 28; four letters before K there is
another crest, frequency 16, and four letters on the other side of K there is another crest, frequency
14. Troughs corresponding to B, C, D and F, G, H are seen at H, I, J and L, M, N in Fig. 7. In
fact, the two distributions may be made to coincide exactly, by shifting the frequency distribution
for the cryptogram six intervals to the left with respect to the distribution for the equivalent
plain-text message, as shown herewith.



[The two distributions made to coincide; the letters marking them are shown one above the other.]

ABCDEFGHIJKLMNOPQRSTUVWXYZ
GHIJKLMNOPQRSTUVWXYZABCDEF

Figure 9.

f. Let us suppose now that nothing is known about the cryptographing process, and that
only the cryptogram and its uniliteral frequency distribution are at hand. It is clear that simply
bearing in mind the spatial relations of the crests and troughs in a normal frequency distribution
would enable the cryptanalyst to fit the distribution to the normal in this case. He would
naturally first assume that Gc=Ap, from which it would follow that if a direct standard alphabet
is involved, Hc=Bp, Ic=Cp, and so on, yielding the following (tentative) deciphering alphabet:

Cipher .... ABCDEFGHIJKLMNOPQRSTUVWXYZ
Plain ..... UVWXYZABCDEFGHIJKLMNOPQRST

g. Now comes the final test: If these assumed values are substituted in the cipher text,
the plain text immediately appears. Thus:

NUYZO RKLUX IKKYZ OSGZK JGZUT etc.
HOSTI LEFOR CEEST IMATE DATON etc.

h. It should be clear, therefore, that the selection of Gc to represent Ap in the cryptographing
process has absolutely no effect upon the relative spatial and frequency relations of the crests
and troughs of the frequency distribution for the cryptogram. If Qc had been selected to
represent Ap, these relations would still remain the same, the whole series of crests and troughs being
merely displaced further to the right of the positions they occupy when Gc=Ap.
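
The statement that a single identification fixes the remaining 25 values can be made concrete. In the sketch below, one cipher-plain pair and an assumed direction (direct or reversed) are enough to write out the whole deciphering alphabet. The function names are introduced for this illustration only.

```python
from string import ascii_uppercase as ALPHABET


def deciphering_alphabet(cipher_letter, plain_letter, reversed_alphabet=False):
    """Return the plain equivalents of cipher A..Z, given one identification."""
    c = ALPHABET.index(cipher_letter)
    p = ALPHABET.index(plain_letter)
    if reversed_alphabet:
        # Reversed standard alphabet: plain and cipher positions sum to a constant.
        total = (c + p) % 26
        return "".join(ALPHABET[(total - i) % 26] for i in range(26))
    # Direct standard alphabet: plain and cipher positions differ by a constant.
    offset = (p - c) % 26
    return "".join(ALPHABET[(i + offset) % 26] for i in range(26))


def decipher(cryptogram, plain_row):
    """Substitute plain values for cipher letters; spacing is left as it is."""
    return cryptogram.upper().translate(str.maketrans(ALPHABET, plain_row))


row = deciphering_alphabet("G", "A")  # the assumption Gc=Ap, direct alphabet
print(row)
print(decipher("NUYZO RKLUX IKKYZ OSGZK JGZUT", row))
```

Passing `reversed_alphabet=True` produces the corresponding reversed alphabet from the same kind of single identification.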

19. Practical example of solution by the frequency method.
a. The case of direct standard alphabet ciphers. (1) The following cryptogram is to be solved
by applying the foregoing principles:

IBMQO PBIUO MBBGA JCZOF MUUQB AJCZO
ZWILN QTTML EQBPU IZKPQ VOQVN IVBZG

(2) From the presence of repetitions and so many low-frequency letters such as B, Q, and
Z it is at once suspected that this is a substitution cipher. But to illustrate the steps that must
be taken in difficult cases in order to be certain in this respect, a uniliteral frequency distribution
is constructed, and then reference is made to charts 1 to 4 to note whether the actual numbers
of vowels, high, medium, and low-frequency consonants fall inside or outside the areas delimited
by the respective curves.

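[Uniliteral frequency distribution of the cryptogram, plotted over the letters below.]

ABCDEFGHIJKLMNOPQRSTUVWXYZ

Figure 10a.

Letters                        Frequency   Position with respect to areas delimited by curves

Vowels (A E I O U Y)               17      Outside, chart 1.
High-frequency consonants           4      Outside, chart 2.
Medium-frequency consonants        25      Outside, chart 3.
Low-frequency consonants           14      Outside, chart 4.

Total                              60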





(3) All four points falling quite outside the areas delimited by the curves applicable to these
four classes of letters, the cryptogram is clearly a substitution cipher.

(4) The appearance of the frequency distribution, with marked crests and troughs, indicates
that the cryptogram is probably monoalphabetic. Reference is now made to Chart 5. The
message has 60 letters and 6 blanks. The point of intersection on the chart is closer to curve
P than it is to curve R; therefore, this is additional evidence that the message is probably
monoalphabetic.
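
The counts used in steps (2) to (4) are easily tallied by machine. The sketch below is an illustration only. The membership of the three consonant classes is an assumption, chosen because it reproduces the frequencies 17, 4, 25 and 14 given in the table above. The charts themselves are not reproduced, so the code stops at the counts.

```python
from collections import Counter
from string import ascii_uppercase as ALPHABET

# Class memberships are assumed for this example; only the vowel list
# appears in the table above.
CLASSES = {
    "vowels": "AEIOUY",
    "high-frequency consonants": "DNRST",
    "medium-frequency consonants": "BCFGHLMPVW",
    "low-frequency consonants": "JKQXZ",
}

CRYPTOGRAM = ("IBMQO PBIUO MBBGA JCZOF MUUQB AJCZO "
              "ZWILN QTTML EQBPU IZKPQ VOQVN IVBZG")


def class_counts(text):
    """Total occurrences of the letters belonging to each class."""
    counts = Counter(ch for ch in text.upper() if ch in ALPHABET)
    return {name: sum(counts[ch] for ch in members)
            for name, members in CLASSES.items()}


def blanks(text):
    """Number of letters of the alphabet that do not occur in the text."""
    present = {ch for ch in text.upper() if ch in ALPHABET}
    return 26 - len(present)


for name, total in class_counts(CRYPTOGRAM).items():
    print(f"{name}: {total}")
print("blanks:", blanks(CRYPTOGRAM))
```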

(5) The next step is to determine whether a standard or a mixed cipher alphabet is involved. 
This is done by studying the positions and the sequence of crests and troughs in the frequency 
distribution, and trying to fit the distribution to the normal.

(6) The first assumption to be made is that a direct standard alphabet is involved. The highest
crest in the distribution is marked by Bc. Let it be assumed that Bc=Ep. Then Cc, Dc, Ec, . . .=Fp,
Gp, Hp, . . ., respectively; thus:

Cipher .... ABCDEFGHIJKLMNOPQRSTUVWXYZ
Plain ..... DEFGHIJKLMNOPQRSTUVWXYZABC

Figure 10b.

At first glance the approximation to the expected frequencies seems fair, especially in the region
F G H I Jp and R S Tp. But there are too many occurrences of Pp, Xp and Cp and too few
occurrences of Ap, Np, Op. Moreover, if a substitution is attempted on this basis, the following
is obtained for the first two cipher groups:

Cipher ........... IBMQO PBIUO
“Plain text” ..... LEPTR SELXR

This is certainly not plain text and it seems clear that Bc is not Ep. A different assumption will
have to be made.

(7) Suppose Qc=Ep. Going through the same steps as before, again no satisfactory results
are obtained. Further trials* are made along the same lines, until the assumption Mc=Ep is tested.

* It is unnecessary, of course, to write out the alphabets as shown in Figs. 10b and c when testing
assumptions. This is usually all done mentally.

Cipher .... ABCDEFGHIJKLMNOPQRSTUVWXYZ
Plain ..... STUVWXYZABCDEFGHIJKLMNOPQR

Figure 10c.

(8) The fit in this case is quite good; possibly there are too many occurrences of Gp and
Mp and too few of Ep, Op and Sp. But the final test remains: trial of the substitution alphabet on the
cryptogram itself. This is immediately done and the results are as follows:

Cryptogram .... IBMQO PBIUO MBBGA JCZOF MUUQB AJCZO
Plain text .... ATEIG HTAMG ETTYS BURGX EMMIT SBURG

Cryptogram .... ZWILN QTTML EQBPU IZKPQ VOQVN IVBZG
Plain text .... ROADF ILLED WITHM ARCHI NGINF ANTRY

AT EIGHT AM GETTYSBURG-EMMITSBURG ROAD FILLED WITH MARCHING INFANTRY.

(9) It is always advisable to note the specific key. In this case the correspondence between
any plain-text letter and its cipher equivalent will indicate the key. Although other conventions
are possible, and equally valid, it is usual to indicate the key by noting the cipher
equivalent of Ap. In this case Ap=Ic.

b. The case of reversed standard alphabet ciphers.

(1) Let the following cryptogram and its frequency distribution be studied. 

IPEAC BPIWC EPPKQ HORCL EWWAP QHORC 

RUIFD AXXEF MAPBW IRGBA VCAVD IVPRK 

(2) The preliminary steps illustrated above, under subpar. a (1) to (4) inclusive, in connection
with the test for class and monoalphabeticity, will here be omitted, since they are exactly
the same in nature. The result is that the cryptogram is obviously a substitution cipher and is
monoalphabetic.

(3) Assuming that it is not known whether a direct or a reversed standard alphabet is
involved, attempts are at once made to fit the frequency distribution to the normal direct sequence.
If the student will try them he will soon find out that these are unsuccessful. All this takes but a
few minutes.

(4) The next logical assumption is now made, viz, that the cipher alphabet is a reversed
standard alphabet. When on this basis Ec is assumed to be Ep, the distribution can readily be
fitted to the normal, practically every crest and trough in the actual distribution corresponding
to a crest or trough in the expected distribution.



Cipher .... ABCDEFGHIJKLMNOPQRSTUVWXYZ
Plain ..... IHGFEDCBAZYXWVUTSRQPONMLKJ

Figure 10d.



(5) When the substitution is made in the cryptogram, the following is obtained. 



Cryptogram IPEAC BPIWC EPPKQ 

Plain text ATEIG HTAMG ETTYS 



(6) The plain-text message is identical with that under paragraph a. The specific key in
this case is also Ap=Ic. If the student will compare the frequency distributions in the two cases,
he will note that the relative positions and extensions of the crests and troughs are identical;
they merely progress in opposite directions.
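
The two cases of Par. 19 may be set side by side. The sketch below deciphers both cryptograms with the specific key Ap=Ic, once on the assumption of a direct and once of a reversed standard alphabet. The function name and its parameters are assumptions made for the illustration.

```python
from string import ascii_uppercase as ALPHABET


def decipher_standard(cryptogram, key, reversed_alphabet=False):
    """Decipher with the standard alphabet whose specific key is Ap = `key`."""
    k = ALPHABET.index(key)
    out = []
    for ch in cryptogram.upper():
        if ch not in ALPHABET:
            continue  # ignore the spacing between groups
        c = ALPHABET.index(ch)
        # Direct: cipher position = key + plain; reversed: key - plain.
        p = (k - c) % 26 if reversed_alphabet else (c - k) % 26
        out.append(ALPHABET[p])
    return "".join(out)


DIRECT = ("IBMQO PBIUO MBBGA JCZOF MUUQB AJCZO "
          "ZWILN QTTML EQBPU IZKPQ VOQVN IVBZG")
REVERSED = ("IPEAC BPIWC EPPKQ HORCL EWWAP QHORC "
            "RUIFD AXXEF MAPBW IRGBA VCAVD IVPRK")

print(decipher_standard(DIRECT, "I"))
print(decipher_standard(REVERSED, "I", reversed_alphabet=True))
```

Both lines print the same plain text, which agrees with the statement in (6) that the two messages are identical.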

20. Solution by completing the plain-component sequence.
a. The case of direct standard alphabet ciphers. (1) The foregoing method of analysis,
involving as it does the construction of a uniliteral frequency distribution, was termed a solution
by the frequency method because it involves the construction of a frequency distribution and its
study. There is, however, another method which is much more rapid, almost wholly mechanical,
and which, moreover, does not necessitate the construction or study of any frequency distribution
whatever. An understanding of the method follows from a consideration of the method of
encipherment of a message by the use of a single, direct standard cipher alphabet.

(2) Note the following encipherment: 

Message: REPEL INVADING CAVALRY

Enciphering alphabet

Plain ..... ABCDEFGHIJKLMNOPQRSTUVWXYZ
Cipher .... GHIJKLMNOPQRSTUVWXYZABCDEF

Encipherment

Plain text .... REPEL INVADING CAVALRY
Cryptogram .... XKVKR OTBGJOTM IGBGRXE

Cryptogram

XKVKR OTBGJ OTMIG BGRXE

(3) The enciphering alphabet shown above represents a case wherein the sequence of letters 
of both components of the cipher alphabet is the normal sequence, with the sequence forming the 
cipher component merely shifted six intervals in retard (or 20 intervals in advance) of the posi- 
tion it occupies in the normal alphabet. If, therefore, two strips of paper bearing the letters of 
the normal sequence, equally spaced, are regarded as the two components of the cipher alphabet 
and are juxtaposed at all of the 25 possible points of coincidence, it is obvious that one of these 
25 juxtapositions must correspond to the actual juxtaposition shown in the enciphering alphabet 
directly above.* It is equally obvious that if a record were kept of the results obtained by ap- 
plying the values given at each juxtaposition to the letters of the cryptogram, one of these results 
would yield the plain text of the cryptogram. 

(4) Let the work be systematized and the results set down in an orderly manner for
examination. It is obviously unnecessary to juxtapose the two components so that Ac=Ap, for on
the assumption of a direct standard alphabet, juxtaposing two direct normal components at
their normal point of coincidence merely yields plain text. The next possible juxtaposition,
therefore, is Ac=Bp. Let the juxtaposition of the two sliding strips therefore be Ac=Bp, as shown
here:

Plain     ABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZ
Cipher     ABCDEFGHIJKLMNOPQRSTUVWXYZ

The values given by this juxtaposition are substituted for the first 20 letters of the cryptogram
and the following results are obtained.

Cryptogram ................. XKVKR OTBGJ OTMIG BGRXE
1st Test, “Plain text” ..... YLWLS PUCHK PUNJH CHSYF

* One of the strips should bear the sequence repeated. This permits juxtaposing the two sequences at all 26
possible points of coincidence so as to have a complete cipher alphabet showing at all times.
This certainly is not intelligible text; obviously, the two components were not in the position
indicated in this first test. The cipher component is therefore slid one interval to the right,
making Ac=Cp, and a second test is made. Thus:

Plain     ABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZ
Cipher      ABCDEFGHIJKLMNOPQRSTUVWXYZ

Cryptogram ................. XKVKR OTBGJ OTMIG BGRXE
2d Test, “Plain text” ...... ZMXMT QVDIL QVOKI DITZG

Neither does the second test result in disclosing any plain text. But, if the results of the two
tests are studied a phenomenon that at first seems quite puzzling comes to light. Thus, suppose
the results of the two tests are superimposed in this fashion.

Cryptogram ................. XKVKR OTBGJ OTMIG BGRXE
1st Test, “Plain text” ..... YLWLS PUCHK PUNJH CHSYF
2d Test, “Plain text” ...... ZMXMT QVDIL QVOKI DITZG

(5) Note what has happened. The net result of the two experiments was merely to continue
the normal sequence begun by the cipher letters at the heads of the several columns. It is
obvious that if the normal sequence is completed in each column the results will be exactly the same
as though the whole set of 25 possible tests had actually been performed. Let the columns therefore
be completed, as shown in Fig. 11.



  X K V K R O T B G J O T M I G B G R X E
  Y L W L S P U C H K P U N J H C H S Y F
  Z M X M T Q V D I L Q V O K I D I T Z G
  A N Y N U R W E J M R W P L J E J U A H
  B O Z O V S X F K N S X Q M K F K V B I
  C P A P W T Y G L O T Y R N L G L W C J
  D Q B Q X U Z H M P U Z S O M H M X D K
  E R C R Y V A I N Q V A T P N I N Y E L
  F S D S Z W B J O R W B U Q O J O Z F M
  G T E T A X C K P S X C V R P K P A G N
  H U F U B Y D L Q T Y D W S Q L Q B H O
  I V G V C Z E M R U Z E X T R M R C I P
  J W H W D A F N S V A F Y U S N S D J Q
  K X I X E B G O T W B G Z V T O T E K R
  L Y J Y F C H P U X C H A W U P U F L S
  M Z K Z G D I Q V Y D I B X V Q V G M T
  N A L A H E J R W Z E J C Y W R W H N U
  O B M B I F K S X A F K D Z X S X I O V
  P C N C J G L T Y B G L E A Y T Y J P W
  Q D O D K H M U Z C H M F B Z U Z K Q X
* R E P E L I N V A D I N G C A V A L R Y
  S F Q F M J O W B E J O H D B W B M S Z
  T G R G N K P X C F K P I E C X C N T A
  U H S H O L Q Y D G L Q J F D Y D O U B
  V I T I P M R Z E H M R K G E Z E P V C
  W J U J Q N S A F I N S L H F A F Q W D

Figure 11.



An examination of the successive horizontal lines of the diagram discloses one and only one line
of plain text, that marked by the asterisk and reading REPELINVADINGCAVALRY.
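
Completing the normal sequence under each cipher letter is a purely mechanical step and is readily sketched. The function name below is an assumption for the illustration; the search for the one intelligible line is left to the eye, as in the text.

```python
from string import ascii_uppercase as ALPHABET


def generatrices(letters):
    """Complete the normal sequence under each letter. Row k holds every
    letter advanced k places; row 0 is the text as given."""
    indices = [ALPHABET.index(ch) for ch in letters.upper() if ch in ALPHABET]
    return ["".join(ALPHABET[(i + k) % 26] for i in indices)
            for k in range(26)]


for k, row in enumerate(generatrices("XKVKR OTBGJ OTMIG BGRXE")):
    print(f"{k:2d}  {row}")
```

The plain text appears on the row numbered 20, which corresponds to the cipher component standing 20 intervals in advance of (six in retard of) its normal position.
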
(6) Since each column in Fig. 11 is nothing but a normal sequence, it is obvious that instead
of laboriously writing down these columns of letters every time a cryptogram is to be examined,
it would be more convenient to prepare a set of strips each bearing the normal sequence doubled
(to permit complete coincidence for an entire alphabet at any setting), and have them available
for examining any future cryptograms. In using such a set of sliding strips in order to solve a
cryptogram prepared by means of a single direct standard cipher alphabet, or to make a test to
determine whether a cryptogram has been so prepared, it is only necessary to “set up” the letters
of the cryptogram on the strips, that is, align them in a single row across the strips (by sliding
the individual strips up or down). The successive horizontal lines, called generatrices (singular,
generatrix), are then examined in a search for intelligible text. If the cryptogram really belongs
to this simple type of cipher, one of the generatrices will exhibit intelligible text all the way
across; this text will practically invariably be the plain text of the message. This method of
analysis may be termed a solution by completing the plain-component sequence. Sometimes it is
referred to as “running down” the sequence. The principle upon which the method is based
constitutes one of the cryptanalyst’s most valuable tools.*

b. The case of reversed standard alphabets. (1) The method described under subpar. a may
also be applied, in slightly modified form, in the case of a cryptogram enciphered by a single
reversed standard alphabet. The basic principles are identical in the two cases.

(2) To show this it is necessary to experiment with two sliding components as before, except 
that in this case one of the components must be a reversed normal sequence, the other, a direct 
normal sequence. 

(3) Let the two components be juxtaposed A to A, as shown below, and then let the resultant 
values be substituted for the letters of the cryptogram. Thus: 

Cryptogram

PCRCV YTLGD YTAEG LGVPI

Plain     ABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZ
Cipher     ZYXWVUTSRQPONMLKJIHGFEDCBA

Cryptogram ................. PCRCV YTLGD YTAEG LGVPI
1st Test, “Plain text” ..... LYJYF CHPUX CHAWU PUFLS

(4) This does not yield intelligible text, and therefore the reversed component is slid one 
space forward and a second test is made. Thus: 

Plain     ABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZ
Cipher      ZYXWVUTSRQPONMLKJIHGFEDCBA

Cryptogram ................. PCRCV YTLGD YTAEG LGVPI
2d Test, “Plain text” ...... MZKZG DIQVY DIBXV QVGMT

(5) Neither does the second test yield intelligible text. But let the results of the two tests
be superimposed. Thus:

Cryptogram ................. PCRCV YTLGD YTAEG LGVPI
1st Test, “Plain text” ..... LYJYF CHPUX CHAWU PUFLS
2d Test, “Plain text” ...... MZKZG DIQVY DIBXV QVGMT

* It is recommended that the student prepare a set of 25 strips, each 16 inches long, made of
well-seasoned wood, and glue alphabet strips to the wood. The alphabet on each strip should be a
double or repeated alphabet with all letters equally spaced.
(6) It is seen that the letters of the “plain text” given by the second trial are merely the
continuants of the normal sequences initiated by the letters of the “plain text” given by the first
trial. If these sequences are “run down” (that is, completed within the columns) the results
must obviously be the same as though successive tests exactly similar to the first two were
applied to the cryptogram, using one reversed normal and one direct normal component. If the
cryptogram has really been prepared by means of a single reversed standard alphabet, one of
the generatrices of the diagram that results from completing the sequences must yield intelligible
text.

(7) Let the diagram be made, or better yet, if the student has already at hand the set of
sliding strips referred to in the footnote to Par. 20a (6), let him “set up” the letters given by the
first trial. Fig. 12 shows the diagram and indicates the plain-text generatrix.

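  P C R C V Y T L G D Y T A E G L G V P I

  L Y J Y F C H P U X C H A W U P U F L S
  M Z K Z G D I Q V Y D I B X V Q V G M T
  N A L A H E J R W Z E J C Y W R W H N U
  O B M B I F K S X A F K D Z X S X I O V
  P C N C J G L T Y B G L E A Y T Y J P W
  Q D O D K H M U Z C H M F B Z U Z K Q X
* R E P E L I N V A D I N G C A V A L R Y
  S F Q F M J O W B E J O H D B W B M S Z
  T G R G N K P X C F K P I E C X C N T A
  U H S H O L Q Y D G L Q J F D Y D O U B
  V I T I P M R Z E H M R K G E Z E P V C
  W J U J Q N S A F I N S L H F A F Q W D
  X K V K R O T B G J O T M I G B G R X E
  Y L W L S P U C H K P U N J H C H S Y F
  Z M X M T Q V D I L Q V O K I D I T Z G
  A N Y N U R W E J M R W P L J E J U A H
  B O Z O V S X F K N S X Q M K F K V B I
  C P A P W T Y G L O T Y R N L G L W C J
  D Q B Q X U Z H M P U Z S O M H M X D K
  E R C R Y V A I N Q V A T P N I N Y E L
  F S D S Z W B J O R W B U Q O J O Z F M
  G T E T A X C K P S X C V R P K P A G N
  H U F U B Y D L Q T Y D W S Q L Q B H O
  I V G V C Z E M R U Z E X T R M R C I P
  J W H W D A F N S V A F Y U S N S D J Q
  K X I X E B G O T W B G Z V T O T E K R

Figure 12.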
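The construction of Fig. 12 can be sketched in a few lines of Python. This is only an illustration of the two steps described in the text (conversion by a reversed standard alphabet set at Ap=Ac, then running down each column); the function names are hypothetical and not part of the original method.

```python
import string

ALPHABET = string.ascii_uppercase


def to_plain_component(cryptogram: str) -> str:
    """Convert cipher letters with a reversed standard alphabet set at Ap = Ac.

    Under this setting A<->A, B<->Z, C<->Y, and so on.
    """
    return "".join(ALPHABET[-ALPHABET.index(c) % 26] for c in cryptogram)


def completion_diagram(first_row: str) -> list[str]:
    """Run every column down the normal A B C ... Z sequence.

    Returns the 26 generatrices, starting with the row that was supplied.
    """
    return [
        "".join(ALPHABET[(ALPHABET.index(c) + step) % 26] for c in first_row)
        for step in range(26)
    ]


cryptogram = "PCRCVYTLGDYTAEGLGVPI"
rows = completion_diagram(to_plain_component(cryptogram))
for row in rows:
    print(row)
```

The first two rows printed are the "plain text" of the first and second tests; the plain-text generatrix is the one a reader recognizes as intelligible.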



(8) The only difference in procedure between this case and the preceding one (where the
cipher alphabet was a direct standard alphabet) is that the letters of the cipher text are first
“deciphered” by means of any reversed standard alphabet and then the columns are “run down”
according to the normal A B C . . . Z sequence. For reasons which will become apparent very
soon, the first step in this method is technically termed converting the cipher letters into their
plain-component equivalents; the second step is the same as before, viz, completing the
plain-component sequence.

21. Special remarks on the method of solution by completing the plain-component sequence. —
a. The terms employed to designate the steps in the solution set forth in Par. 20b, viz,
“converting the cipher letters into their plain-component equivalents” and “completing the
plain-component sequence”, accurately describe the process. Their meaning will become more clear
as the student progresses with the work. It may be said that whenever the plain component of
a cipher alphabet is a known sequence, no matter how it is composed, the difficulty and time
required to solve any cryptogram involving the use of that plain component is practically cut
in half. In some cases this knowledge facilitates, and in other cases is the only thing that makes
possible, the solution of a very short cryptogram that might otherwise defy solution. Later on an
example will be given to illustrate what is meant in this regard.

b. The student should take note, however, of two qualifying expressions that were employed
in a preceding paragraph to describe the results of the application of the method. It was stated
that “one of the generatrices will exhibit intelligible text all the way across; this text will practically
invariably be the plain text.” Will there ever be a case in which more than one generatrix will
yield intelligible text throughout its extent? That obviously depends almost entirely on the
number of letters that are aligned to form a generatrix. If a generatrix contains but a very few
letters, only five, for example, it may happen as a result of pure chance that there will be two or
more generatrices showing what might be “intelligible text.” Note in Fig. 11, for example, that
there are several cases in which 3-letter and 4-letter English words (ANY, VAIN, GOT, TIP, etc.)
appear on generatrices that are not correct, these words being formed by pure chance. But there
is not a single case, in this diagram, of a 6-letter or longer word appearing fortuitously, because
obviously the longer the word the smaller the probability of its appearance purely by chance;
and the probability that two generatrices of 15 letters each will both yield intelligible text along
their entire length is exceedingly remote, so remote, in fact, that in practical cryptography such
a case may be considered nonexistent.*

c. The student should observe that in reality there is no difference whatsoever in principle
between the two methods presented in subpars. a and b of Par. 20. In the former the preliminary
step of converting the cipher letters into their plain-component equivalents is apparently not
present but in reality it is there. The reason for its apparent absence is that in that case the
plain component of the cipher alphabet is identical in all respects with the cipher component, so
that the cipher letters require no conversion, or, rather, they are identical with the equivalents
that would result if they were converted on the basis Ac=Ap. In fact, if the solution process had
been arbitrarily initiated by converting the cipher letters into their plain-component equivalents
at the setting Ac=Op, for example, and the cipher component slid one interval to the right
thereafter, the results of the first and second tests of Par. 20a would be as follows:

Cryptogram............. XKVKR OTBGJ OTMIG BGRXE

1st Test: “Plain text”. LYJYF CHPUX CHAWU PUFLS

2d Test: “Plain text”.. MZKZG DIQVY DIBXV QVGMT

Thus, the foregoing diagram duplicates in every particular the diagram resulting from the first
two tests under Par. 20b: a first line of cipher letters, a second line of letters derived from them
but showing externally no relationship with the first line, and a third line derived immediately
from the second line by continuing the direct normal sequence. This point is brought to attention
only for the purpose of showing that a single, broad principle is the basis of the general method of
solution by completing the plain-component sequence, and once the student has this firmly in
mind he will have no difficulty whatsoever in realizing when the principle is applicable, what a
powerful cryptanalytic tool it can be, and what results he may expect from its application in
specific instances.

* A person with patience and an inclination toward the curiosities of the science might construct a text of 16
or more letters which would yield two “intelligible” texts on the plain-component completion diagram.

d. In the two foregoing examples of the application of the principle, the plain component 
was a normal sequence but it should be clear to the student, if he has grasped what has been said 
in the preceding subparagraph, that this component may be a mixed sequence which, if known 
(that is, if the sequence of letters comprising the sequence is known to the cryptanalyst), can be 
handled just as readily as can a plain component that is a normal sequence. 

e. It is entirely immaterial at what points the plain and the cipher components are juxtaposed
in the preliminary step of converting the cipher letters into their plain-component equivalents.
For example, in the case of the reversed alphabet cipher solved in Par. 20b, the two components
were arbitrarily juxtaposed to give the value Ap=Ac, but they might have been juxtaposed at any
of the other 25 possible points of coincidence without in any way affecting the final result, viz, the
production of one plain-text generatrix in the completion diagram.
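The claim that the point of juxtaposition is immaterial can be checked mechanically. The sketch below is an illustration only, with hypothetical helper names. It converts the cryptogram of Par. 20b at each of the 26 possible settings of a reversed standard alphabet and confirms that the resulting completion diagrams contain the same 26 generatrices, merely in a different order.

```python
import string

ALPHABET = string.ascii_uppercase


def convert_reversed(cryptogram: str, key: int) -> str:
    """Reversed standard alphabet set so that plain = (key - cipher) mod 26.

    key = 0 corresponds to the setting Ap = Ac used in the text.
    """
    return "".join(ALPHABET[(key - ALPHABET.index(c)) % 26] for c in cryptogram)


def generatrices(row: str) -> set[str]:
    """The set of 26 rows obtained by completing the normal sequence."""
    return {
        "".join(ALPHABET[(ALPHABET.index(c) + step) % 26] for c in row)
        for step in range(26)
    }


cryptogram = "PCRCVYTLGDYTAEGLGVPI"
reference = generatrices(convert_reversed(cryptogram, 0))
same_for_all = all(
    generatrices(convert_reversed(cryptogram, key)) == reference
    for key in range(26)
)
print(same_for_all)
```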

22. Value of mechanical solution as a short cut. — a. It is obvious that the very first step
the student should take in his attempts to solve an unknown cryptogram that is obviously a
substitution cipher is to try the mechanical method of solution by completing the plain-component
sequence, using the normal alphabet, first direct, then reversed. This takes only a very few
minutes and is conclusive in its results. It saves the labor and trouble of constructing a frequency
distribution in case the cipher is of this simple type. Later on it will be seen how certain
variations of this simple type may also be solved by the application of this method. Thus, a very
easy short cut to solution is afforded, which even the experienced cryptanalyst never overlooks
in his first attack on an unknown cipher.
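As a rough sketch of this first attack, the two mechanical tests (direct, then reversed) can be run together, leaving 52 candidate rows for the analyst to scan by eye. The function names are hypothetical; since Par. 21e shows the setting to be immaterial, the direct test simply runs down the cipher letters themselves.

```python
import string

ALPHABET = string.ascii_uppercase


def run_down(row: str) -> list[str]:
    """Complete the normal sequence in every column (26 generatrices)."""
    return [
        "".join(ALPHABET[(ALPHABET.index(c) + step) % 26] for c in row)
        for step in range(26)
    ]


def mechanical_tests(cryptogram: str) -> dict[str, list[str]]:
    """Return the generatrices for the direct and the reversed normal alphabet."""
    letters = "".join(c for c in cryptogram.upper() if c in ALPHABET)
    converted = "".join(ALPHABET[-ALPHABET.index(c) % 26] for c in letters)
    return {"direct": run_down(letters), "reversed": run_down(converted)}


direct_case = mechanical_tests("XKVKR OTBGJ OTMIG BGRXE")
reversed_case = mechanical_tests("PCRCV YTLGD YTAEG LGVPI")
print("REPELINVADINGCAVALRY" in direct_case["direct"])
print("REPELINVADINGCAVALRY" in reversed_case["reversed"])
```

If neither list contains an intelligible row, the cryptogram is not of the standard-alphabet type.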

b. It is important now to note that if neither of the two foregoing attempts is successful in
bringing plain text to light and the cryptogram is quite obviously monoalphabetic in character, the
cryptanalyst is warranted in assuming that the cryptogram involves a mixed cipher alphabet.* The
steps to be taken in attacking a cipher of the latter type will be discussed in the next section.

* There is but one other possibility, already referred to under Par. 17d, which involves the case where
transposition and monoalphabetic substitution processes have been applied in successive steps. This is unusual,
however, and will be discussed in its proper place.

UNILITERAL SUBSTITUTION WITH MIXED CIPHER ALPHABETS

                                                                                          Paragraph
Basic reason for the low degree of cryptographic security afforded by monoalphabetic
  cryptograms involving standard cipher alphabets ....................................... 23
Preliminary steps in the analysis of a monoalphabetic, mixed-alphabet cryptogram ........ 24
Further data concerning normal plain text ............................................... 25
Preparation of the work sheet ........................................................... 26
Triliteral-frequency distributions ...................................................... 27
Classifying the cipher letters into vowels and consonants ............................... 28
Further analysis of the letters representing vowels and consonants ...................... 29
Substituting deduced values in the cryptogram ........................................... 30
Completing the solution ................................................................. 31
General remarks on the foregoing solution ............................................... 32
The “probable-word” method; its value and applicability ................................. 33
Solution of additional cryptograms produced by the same cipher component ................ 34


23. Basic reason for the low degree of cryptographic security afforded by monoalphabetic
cryptograms involving standard cipher alphabets. — The student has seen that the solution of
monoalphabetic cryptograms involving standard cipher alphabets is a very easy matter. Two
methods of analysis were described, one involving the construction of a frequency distribution,
the other not requiring this kind of tabulation, being almost mechanical in nature and
correspondingly rapid. In the first of these two methods it was necessary to make a correct assumption
as to the value of but one of the 26 letters of the cipher alphabet and the values of the remaining
25 letters at once become known; in the second method it was not necessary to assume a value
for even a single cipher letter. The student should understand what constitutes the basis of this
situation, viz, the fact that the two components of the cipher alphabet are composed of known
sequences. What if one or both of these components are, for the cryptanalyst, unknown sequences?
In other words, what difficulties will confront the cryptanalyst if the cipher component of the
cipher alphabet is a mixed sequence? Will such an alphabet be solvable as a whole at one stroke,
or will it be necessary to solve its values individually? Since the determination of the value of
one cipher letter in this case gives no direct clues to the value of any other letter, it would seem
that the solution of such a cipher should involve considerably more analysis and experiment than
has the solution of either of the two types of ciphers so far examined. A typical
example will be studied.

24. Preliminary steps in the analysis of a monoalphabetic, mixed-alphabet cryptogram. —
a. Note the following cryptogram:

SFDZF IOGHL PZFGZ DYSPF HBZDS GVHTF UPLVD FGYVJ VFVHT GADZZ AITYD

ZYFZJ ZTGPT VTZBD VFHTZ DFXSB GIDZY VTXOI YVTEF VMGZZ THLLV XZDFM

HTZAI TYDZY BDVFH TZDFK ZDZZJ SXISG ZYGAV FSLGZ DTHHT CDZRS VTYZD

OZFFH TZAIT YDZYG AVDGZ ZTKHI TYZYS DZGHU ZFZTG UPGDI XWGHX ASRUZ

DFUID EGHTV EAGXX

b. A casual inspection of the text discloses the presence of several long repetitions as well as
of many letters of normally low frequency, such as F, G, V, X, and Z; on the other hand, letters of
normally high frequency, such as the vowels, and the consonants N and R, are relatively scarce.
The cryptogram is obviously a substitution cipher and the usual mechanical tests for determining
whether it is possibly of the monoalphabetic, standard-alphabet type are applied. The results
being negative, a uniliteral frequency distribution is immediately constructed and is as shown
in Figure 13.

 A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X  Y  Z
 8  4  1 23  3 19 19 15 10  3  2  5  2  0  3  5  0  2 10 22  5 16  1  8 14 35

Figure 13.
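The tabulation of Fig. 13 can be reproduced with a short script. This is a minimal sketch, assuming the cryptogram is typed in as a string; the function name is hypothetical. It prints each letter with its count and a row of tally strokes, which gives the "crests and troughs" appearance referred to in the text.

```python
from collections import Counter
import string

CRYPTOGRAM = """
SFDZF IOGHL PZFGZ DYSPF HBZDS GVHTF UPLVD FGYVJ VFVHT GADZZ AITYD
ZYFZJ ZTGPT VTZBD VFHTZ DFXSB GIDZY VTXOI YVTEF VMGZZ THLLV XZDFM
HTZAI TYDZY BDVFH TZDFK ZDZZJ SXISG ZYGAV FSLGZ DTHHT CDZRS VTYZD
OZFFH TZAIT YDZYG AVDGZ ZTKHI TYZYS DZGHU ZFZTG UPGDI XWGHX ASRUZ
DFUID EGHTV EAGXX
"""


def uniliteral_distribution(text: str) -> dict[str, int]:
    """Count every letter A-Z, including letters that do not occur at all."""
    counts = Counter(c for c in text.upper() if c in string.ascii_uppercase)
    return {letter: counts.get(letter, 0) for letter in string.ascii_uppercase}


dist = uniliteral_distribution(CRYPTOGRAM)
for letter, n in dist.items():
    print(f"{letter} {n:2d} {'|' * n}")
print("Total:", sum(dist.values()))
```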

c. The fact that the frequency distribution shows very marked crests and troughs means
that the cryptogram is undoubtedly monoalphabetic; the fact that it has already been tested
(by the method of completing the plain-component sequence) and found not to be of the
monoalphabetic, standard-alphabet type, indicates with a high degree of probability that it involves
a mixed cipher alphabet. A few moments might be devoted to making a careful inspection of the
distribution to insure that it cannot be made to fit the normal; the object of this would be to
rule out the possibility that the text resulting from substitution by a standard cipher alphabet
had subsequently been transposed. But this inspection in this case is hardly necessary, in
view of the presence of long repetitions in the message.* (See Par. 13p.)

d. One might, of course, attempt to solve the cryptogram by applying the simple principles
of frequency. One might, in other words, assume that Zc (the letter of greatest frequency)
represents Ep, Dc (the letter of next greatest frequency) represents Tp, and so on. If the message
were long enough this simple procedure might more or less quickly give the solution. But the
message is relatively short and many difficulties would be encountered. Much time and effort
would be expended unnecessarily, because it is hardly to be expected that in a message of only
235 letters the relative order of frequency of the various cipher letters should exactly coincide
with, or even closely approximate, the relative order of frequency of letters of normal plain text
found in a count of 50,000 letters. It is to be emphasized that the beginner must repress the natural
tendency to place too much confidence in the generalized principles of frequency and to rely too much
upon them. It is far better to bring into effective use certain other data concerning normal
plain text which thus far have not been brought to notice.

25. Further data concerning normal plain text. — a. Just as the individual letters constituting 
a large volume of plain text have more or less characteristic or fixed frequencies, so it is found 
that digraphs and trigraphs have characteristic frequencies, when a large volume of text is 
studied statistically. In Appendix 1, Table 6, are shown the relative frequencies of all digraphs 
appearing in the 260 telegrams referred to in Paragraph 9e. It will be noted that 428 of the 676 
possible pairs of letters occur in these telegrams, but whereas many of them occur but once or 
twice, there are a few which occur hundreds of times. 

b. In Appendix 1 will also be found several other kinds of tables and lists which will be useful
to the student in his work, such as the relative order of frequency of the 50 digraphs of greatest
frequency, the relative order of frequency of doubled letters, doubled vowels, doubled consonants,
and so on. It is suggested that the student refer to this appendix now, to gain an idea of the
data available for his future reference. Just how these data may be employed will become
apparent very shortly.

* This possible step is mentioned here for the purpose of making it clear that the plain-component sequence
completion method cannot solve a case in which transposition has followed or preceded monoalphabetic
substitution with standard alphabets. Cases of this kind will be discussed in a later text. It is sufficient to indicate
at this point that the frequency distribution for such a combined substitution-transposition cipher would present
the characteristics of a standard alphabet cipher, and yet the method of completing the plain-component
sequence would fail to bring out any plain text.

26. Preparation of the work sheet. — a. The details to be considered in this paragraph may 
at first appear to be superfluous but long experience has proved that systematization of the 
work, and preparation of the data in the most utilizable, condensed form is most advisable, even 
if this seems to take considerable time. In the first place if it merely serves to avoid interruptions
and irritations occasioned by failure to have the data in an instantly available form, it
will pay by saving mental wear and tear. In the second place, especially in the case of com- 
plicated cryptograms, painstaking care in these details, while it may not always bring about 
success, is often the factor that is of greatest assistance in ultimate solution. The detailed 
preparation of the data may be irksome to the student, and he may be tempted to avoid as much 
of it as possible, but, unfortunately, in the early stages of solving a cryptogram he does not know 
(nor, for that matter, does the expert always know) just which data are essential and which 
may be neglected. Even though not all of the data may turn out to have been necessary, as a 
general rule, time is saved in the end if all the usual data are prepared as a regular preliminary 
to the solution of most cryptograms. 

b. First, the cryptogram is recopied in the form of a work sheet. This sheet should be of a
good quality of paper so as to withstand considerable erasure. If the cryptogram is to be
copied by hand, cross-section paper of ¼-inch squares is extremely useful. The writing should
be in ink, and plain, carefully made roman capital letters should be used in all cases. If the
cryptogram is to be copied on a typewriter, the ribbon employed should be impregnated with an
ink that will not smear or smudge under the hand.

c. The arrangement of the characters of the cryptogram on the work sheet is a matter of
considerable importance. If the cryptogram as first obtained is in groups of regular length
(usually five characters to a group) and if the uniliteral frequency distribution shows the
cryptogram to be monoalphabetic, the characters should be copied without regard to this grouping.
It is advisable to allow two spaces between letters, and to write a constant number of letters
per line, approximately 25. At least two spaces, preferably three spaces, should be left between
horizontal lines. Care should be taken to avoid crowding the letters in any case, for this is
not only confusing to the eye but also mentally irritating when later it is found that not enough
space has been left for making various sorts of marks or indications. If the cryptogram is
originally in what appears to be word lengths (and this is the case, as a rule, only with the cryptograms
of amateurs), naturally it should be copied on the work sheet in the original groupings. If
further study of a cryptogram shows that some special grouping is required, it is often best to
recopy it on a fresh work sheet rather than to attempt to indicate the new grouping on the old
work sheet.

d. In order to be able to locate or refer to specific letters or groups of letters with speed,
certainty, and without possibility of confusion, it is advisable to use coordinates applied to the
lines and columns of the text as it appears on the work sheet. To minimize possibility of
confusion, it is best to apply letters to the horizontal lines of the text, numbers to the vertical columns.
In referring to a letter the horizontal line in which the letter is located is usually given first. Thus,
referring to the work sheet shown below, coordinates A17 designate the letter Y, the 17th letter
in the first line. The letter I is usually omitted from the series of line indicators so as to avoid
confusion with the figure 1. If lines are limited to 25 letters each, then each set of 100 letters of
the text is automatically blocked off by remembering that 4 lines constitute 100 letters.
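The coordinate scheme just described (line letters with I omitted, 25 letters to the line) can be expressed as a small lookup. The following is a sketch only; the function names `letter_at` and `coordinate_of` are hypothetical.

```python
import string

LINE_LABELS = [c for c in string.ascii_uppercase if c != "I"]  # I is omitted
LINE_LENGTH = 25

CRYPTOGRAM = """
SFDZF IOGHL PZFGZ DYSPF HBZDS GVHTF UPLVD FGYVJ VFVHT GADZZ AITYD
ZYFZJ ZTGPT VTZBD VFHTZ DFXSB GIDZY VTXOI YVTEF VMGZZ THLLV XZDFM
HTZAI TYDZY BDVFH TZDFK ZDZZJ SXISG ZYGAV FSLGZ DTHHT CDZRS VTYZD
OZFFH TZAIT YDZYG AVDGZ ZTKHI TYZYS DZGHU ZFZTG UPGDI XWGHX ASRUZ
DFUID EGHTV EAGXX
"""


def letter_at(text: str, coordinate: str) -> str:
    """Return the letter at a work-sheet coordinate such as 'A17'."""
    letters = [c for c in text.upper() if c in string.ascii_uppercase]
    line = LINE_LABELS.index(coordinate[0].upper())
    column = int(coordinate[1:])
    if not 1 <= column <= LINE_LENGTH:
        raise ValueError(f"column must be 1..{LINE_LENGTH}")
    return letters[line * LINE_LENGTH + column - 1]


def coordinate_of(position: int) -> str:
    """Return the coordinate of the letter at a 1-based position in the text."""
    line, column = divmod(position - 1, LINE_LENGTH)
    return f"{LINE_LABELS[line]}{column + 1}"


print(letter_at(CRYPTOGRAM, "A17"))
print(coordinate_of(101))
```

Because four lines hold exactly 100 letters, position 101 falls at the start of the fifth line, E.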

e. Above each character of the cipher text may be placed some indication of the frequency of that
character in the whole cryptogram. This indication may be the actual number of times the
character occurs, or, if colored pencils are used, the cipher letters may be divided up into three
categories or groups: high frequency, medium frequency, and low frequency. It is perhaps
simpler, if clerical help is available, to indicate the actual frequencies. This saves constant
reference to the frequency tables, which interrupts the train of thought, and saves considerable
time in the end.

f. After the special frequency distribution, explained in Par. 27 below, has been constructed,
repetitions of digraphs and trigraphs should be underscored. In so doing, the student should be
particularly watchful of trigraphic repetitions which can be further extended into tetragraphs
and polygraphs of greater length. Repetitions of more than ten characters should be set off by
heavy vertical lines, as they indicate repeated phrases and are of considerable assistance in
solution. If a repetition continues from one line to the next, put an arrow at the end of the
underscore to signal this fact. Reversible digraphs should also be indicated by an underscore
with an arrow pointing in both directions. Anything which strikes the eye as being peculiar,
unusual, or significant as regards the distribution or recurrence of the characters should be
noted. All these marks should, if convenient, be made with ink so as not to cause smudging.
The work sheet will now appear as shown herewith (the underscoring of repetitions is not
reproduced here). The frequency of each cipher letter is written above it:

      1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25

     10 19 23 35 19 10  3 19 15  5  5 35 19 19 35 23 14 10  5 19 15  4 35 23 10
A     S  F  D  Z  F  I  O  G  H  L  P  Z  F  G  Z  D  Y  S  P  F  H  B  Z  D  S

     19 16 15 22 19  5  5  5 16 23 19 19 14 16  3 16 19 16 15 22 19  8 23 35 35
B     G  V  H  T  F  U  P  L  V  D  F  G  Y  V  J  V  F  V  H  T  G  A  D  Z  Z

      8 10 22 14 23 35 14 19 35  3 35 22 19  5 22 16 22 35  4 23 16 19 15 22 35
C     A  I  T  Y  D  Z  Y  F  Z  J  Z  T  G  P  T  V  T  Z  B  D  V  F  H  T  Z

     23 19  8 10  4 19 10 23 35 14 16 22  8  3 10 14 16 22  3 19 16  2 19 35 35
D     D  F  X  S  B  G  I  D  Z  Y  V  T  X  O  I  Y  V  T  E  F  V  M  G  Z  Z

     22 15  5  5 16  8 35 23 19  2 15 22 35  8 10 22 14 23 35 14  4 23 16 19 15
E     T  H  L  L  V  X  Z  D  F  M  H  T  Z  A  I  T  Y  D  Z  Y  B  D  V  F  H

     22 35 23 19  2 35 23 35 35  3 10  8 10 10 19 35 14 19  8 16 19 10  5 19 35
F     T  Z  D  F  K  Z  D  Z  Z  J  S  X  I  S  G  Z  Y  G  A  V  F  S  L  G  Z

     23 22 15 15 22  1 23 35  2 10 16 22 14 35 23  3 35 19 19 15 22 35  8 10 22
G     D  T  H  H  T  C  D  Z  R  S  V  T  Y  Z  D  O  Z  F  F  H  T  Z  A  I  T

     14 23 35 14 19  8 16 23 19 35 35 22  2 15 10 22 14 35 14 10 23 35 19 15  5
H     Y  D  Z  Y  G  A  V  D  G  Z  Z  T  K  H  I  T  Y  Z  Y  S  D  Z  G  H  U

     35 19 35 22 19  5  5 19 23 10  8  1 19 15  8  8 10  2  5 35 23 19  5 10 23
J     Z  F  Z  T  G  U  P  G  D  I  X  W  G  H  X  A  S  R  U  Z  D  F  U  I  D

      3 19 15 22 16  3  8 19  8  8
K     E  G  H  T  V  E  A  G  X  X

27. Triliteral-frequency distributions. — a. In what has gone before, a type of frequency
distribution known as a uniliteral frequency distribution was used. This, of course, shows only
the number of times each individual letter occurs. In order to apply the normal digraphic and
trigraphic frequency data (given in Appendix 1) to the solution of a cryptogram of the type now
being studied, it is obvious that the data with respect to digraphs and trigraphs occurring in the
cryptogram should be compiled and should be compared with the data for normal plain text. In
order to accomplish this in suitable manner, it is advisable to construct a slightly more
complicated form of distribution termed a triliteral frequency distribution.*

b. Given a cryptogram of 50 or more letters and the task of determining what trigraphs are 
present in the cryptogram, there are three ways in which the data may be arranged or assembled. 
One may require that the data show (1) each letter with its two succeeding letters; (2) each letter 
with its two preceding letters; (3) each letter with one preceding letter and one succeeding letter. 

c. A distribution of the first of the three foregoing types may be designated as a “triliteral
frequency distribution showing two suffixes”; the second type may be designated as a
“triliteral frequency distribution showing two prefixes”; the third type may be designated as
a “triliteral frequency distribution showing one prefix and one suffix.” Quadriliteral and
pentaliteral frequency distributions may occasionally be found useful.

d. Which of these three arrangements is to be employed at a specific time depends largely
upon what the data are intended to show. For present purposes, in connection with the solution
of a monoalphabetic substitution cipher employing a mixed alphabet, possibly the third
arrangement, that showing one prefix and one suffix, is most satisfactory.

e. It is convenient to use cross-section paper for the construction of a triliteral frequency
distribution in the form of a distribution showing crests and troughs, such as that in
Figure 14. In that figure the prefix to each letter to be recorded is inserted in the left half of the
cell directly above the cipher letter being recorded; the suffix to each letter is inserted in the right
half of the cell directly above the letter being recorded; and in each case the prefix and the
suffix to the letter being recorded occupy the same cell, the prefix being directly to the left of the
suffix. The number in parentheses gives the total frequency for each letter.
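A triliteral frequency distribution showing one prefix and one suffix can be compiled mechanically. The sketch below is illustrative only: the function name is hypothetical, and the use of "-" for the missing prefix of the first letter and the missing suffix of the last letter is an assumption, not something taken from the figure.

```python
from collections import defaultdict
import string

CRYPTOGRAM = """
SFDZF IOGHL PZFGZ DYSPF HBZDS GVHTF UPLVD FGYVJ VFVHT GADZZ AITYD
ZYFZJ ZTGPT VTZBD VFHTZ DFXSB GIDZY VTXOI YVTEF VMGZZ THLLV XZDFM
HTZAI TYDZY BDVFH TZDFK ZDZZJ SXISG ZYGAV FSLGZ DTHHT CDZRS VTYZD
OZFFH TZAIT YDZYG AVDGZ ZTKHI TYZYS DZGHU ZFZTG UPGDI XWGHX ASRUZ
DFUID EGHTV EAGXX
"""


def triliteral_distribution(text: str) -> dict[str, list[str]]:
    """Map each cipher letter to its prefix-suffix pairs, in order of occurrence."""
    letters = [c for c in text.upper() if c in string.ascii_uppercase]
    table = defaultdict(list)
    for i, letter in enumerate(letters):
        prefix = letters[i - 1] if i > 0 else "-"                  # assumed marker
        suffix = letters[i + 1] if i + 1 < len(letters) else "-"   # assumed marker
        table[letter].append(prefix + suffix)
    return dict(table)


table = triliteral_distribution(CRYPTOGRAM)
for letter in sorted(table):
    print(f"{letter} ({len(table[letter])}): {' '.join(table[letter])}")
```

Each printed line corresponds to one column of the figure: the letter, its total frequency in parentheses, and its prefix-suffix pairs.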

* Heretofore such a distribution has been termed a “trigraphic frequency table.” It is thought that the word
“triliteral” is more suitable to correspond with the designation “uniliteral” in the case of the distribution of the
single letters. A trigraphic distribution of A B C D E F would consider only the trigraphs A B C and D E F,
whereas a triliteral distribution would consider the groups A B C, B C D, C D E, and D E F. (See also Par. 11d.)
The use of the word “distribution” to replace the word “table” has already been explained.

[Figure 14. Triliteral frequency distribution of the cryptogram, showing one prefix and one suffix
for each cipher letter, with the total frequency of each letter in parentheses. The body of the
figure is not recoverable.]

                      CONDENSED TABLE OF REPETITIONS

        Digraphs                 Trigraphs            Longer polygraphs

DZ-9     TZ-5     VF-4       DZY-4     FHT-3          HTZAITYDZY-2
ZD-9     TY-5     VT-4       HTZ-4     TYD-3          BDVFHTZDF-2
HT-8     FH-4     ZF-4       ITY-4     YDZ-3          ZAITYDZY-3
ZY-6     GH-4     ZT-4       ZDF-4     ZAI-3          FHTZ-3
DF-5     IT-4     ZZ-4                 AIT-3
GZ-5

f. The triliteral frequency distribution is now to be examined with a view to ascertaining
what digraphs and trigraphs occur two or more times in the cryptogram. Consider the pair
of columns containing the prefixes and suffixes to Dc in the distribution, as shown in Fig. 14.
This pair of columns shows that the following digraphs appear in the cryptogram:

Digraphs based on prefixes (arranged as one reads up the column)

FD, ZD, ZD, VD, AD, YD, BD,
ZD, ID, ZD, YD, BD, ZD, ZD,
ZD, CD, ZD, YD, VD, SD, GD,
ZD, ID

Digraphs based on suffixes (arranged as one reads up the column)

DZ, DY, DS, DF, DZ, DZ, DV,
DF, DZ, DF, DZ, DV, DF, DZ,
DT, DZ, DO, DZ, DG, DZ, DI,
DF, DE

The nature of the triliteral frequency distribution is such that in finding what digraphs are
present in the cryptogram it is immaterial whether the prefixes or the suffixes to the cipher
letters are studied, so long as one is consistent in the study. For example, in the foregoing list of
digraphs based on the prefixes to Dc, the digraphs FD, ZD, ZD, VD, etc., are found; if now the
student will refer to the suffixes of Fc, Zc, Vc, etc., he will find the very same digraphs indicated.
This being the case, the question may be raised as to what value there is in listing both the
prefixes and the suffixes to the cipher letters. The answer is that by so doing the trigraphs are
indicated at the same time. For example, in the case of Dc, the following trigraphs are indicated:

FDZ, ZDY, ZDS, VDF, ADZ, YDZ, BDV, ZDF, IDZ, ZDF, YDZ, BDV, ZDF,
ZDZ, ZDT, CDZ, ZDO, YDZ, VDG, SDZ, GDI, ZDF, IDE.



g. The repeated digraphs and trigraphs can now be found quite readily. Thus, in the case
of Dc, examining the list of digraphs based on suffixes, the following repetitions are noted:

DZ appears 9 times
DF appears 5 times
DV appears 2 times

Examining the trigraphs with Dc as central letter, the following repetitions are noted:

ZDF appears 4 times
YDZ appears 3 times
BDV appears 2 times

h. It is unnecessary, of course, to go through the detailed procedure set forth in the
preceding subparagraphs in order to find all the repeated digraphs and trigraphs. The repeated
trigraphs with Dc as central letter can be found merely from an inspection of the prefixes and
suffixes opposite Dc in the distribution. It is necessary only to find those cases in which two or
more prefixes are identical at the same time that the suffixes are identical. For example, the
distribution shows at once that in four cases the prefix to Dc is Zc at the same time that the
suffix to this letter is Fc. Hence, the trigraph ZDF appears four times. The repeated trigraphs
may all be found in this manner.

i. The most frequently repeated digraphs and trigraphs are then assembled in what is
termed a condensed table of repetitions, so as to bring this information prominently before the eye.
As a rule, digraphs which occur less than four or five times, and trigraphs which occur less than
three or four times may be omitted from the condensed table as being relatively of no importance
in the study of repetitions. In the condensed table the frequencies of the individual letters
forming the most important digraphs, trigraphs, etc., should be indicated.
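The search for repetitions can also be made by counting every overlapping group of n letters directly, without passing through the triliteral distribution. The sketch below is an illustration with hypothetical function names; the thresholds passed to `repeated` are the reader's choice, in the spirit of the rule of thumb given above.

```python
from collections import Counter
import string

CRYPTOGRAM = """
SFDZF IOGHL PZFGZ DYSPF HBZDS GVHTF UPLVD FGYVJ VFVHT GADZZ AITYD
ZYFZJ ZTGPT VTZBD VFHTZ DFXSB GIDZY VTXOI YVTEF VMGZZ THLLV XZDFM
HTZAI TYDZY BDVFH TZDFK ZDZZJ SXISG ZYGAV FSLGZ DTHHT CDZRS VTYZD
OZFFH TZAIT YDZYG AVDGZ ZTKHI TYZYS DZGHU ZFZTG UPGDI XWGHX ASRUZ
DFUID EGHTV EAGXX
"""


def ngram_counts(text: str, n: int) -> Counter:
    """Count every overlapping group of n consecutive letters."""
    letters = "".join(c for c in text.upper() if c in string.ascii_uppercase)
    return Counter(letters[i:i + n] for i in range(len(letters) - n + 1))


def repeated(text: str, n: int, minimum: int) -> list[tuple[str, int]]:
    """Groups of n letters occurring at least `minimum` times, most frequent first."""
    counts = ngram_counts(text, n)
    kept = [(group, k) for group, k in counts.items() if k >= minimum]
    return sorted(kept, key=lambda item: (-item[1], item[0]))


print("Digraphs :", repeated(CRYPTOGRAM, 2, 4))
print("Trigraphs:", repeated(CRYPTOGRAM, 3, 3))
```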

28. Classifying the cipher letters into vowels and consonants. — a. Before proceeding to a
detailed analysis of the repeated digraphs and trigraphs, a very important step can be taken which
will be of assistance not only in the analysis of the repetitions but also in the final solution of
the cryptogram. This step concerns the classification of the high-frequency letters into two
groups: vowels and consonants. For if the cryptanalyst can quickly ascertain the equivalents
of the four vowels, A, E, I, and O, and of only the four consonants, N, R, S, and T, he will then
have the values of approximately two-thirds of all the cipher letters that occur in the cryptogram;
the values of the remaining letters can almost be filled in automatically.

b. The basis for the classification will be found to rest upon a comparatively simple
phenomenon: the associational or combinatory behavior of vowels is, in general, quite different
from that of consonants. If an examination be made of Table 7-B in Appendix 1, showing the
relative order of frequency of the 18 digraphs composing 25 percent of English telegraphic text,
it will be seen that the letter E enters into the composition of 9 of the 18 digraphs; that is, in
exactly half of all the cases the letter E is one of the two letters forming the digraph. The
digraphs containing E are as follows:

ED EN ER ES 

NE RE SE TE VE 

The remaining nine digraphs are as follows: 

AN ND OR ST 

IN NT TH 

ON TO 

c. None of the 18 digraphs is a combination of vowels. Note now that of the 9 combinations
with E, 7 are with the consonants N, R, S, and T, one is with D, one is with V, and none is with any
vowel. In other words, Ep combines most readily with consonants but not with other vowels, or
even with itself. Using the terms often employed in the chemical analogy, E shows a great
“affinity” for the consonants N, R, S, T, but not for the vowels. Therefore, if the letters of highest
frequency occurring in a given cryptogram are listed, together with the number of times each of
them combines with the cipher equivalent of Ep, those which show considerable combining power
or affinity for the cipher equivalent of Ep may be assumed to be the cipher equivalents of N, R, S,
Tp; those which do not show any affinity for the cipher equivalent of Ep may be assumed to be the
cipher equivalents of A, I, O, Up. Applying these principles to the problem in hand, and examining
the triliteral frequency distribution, it is quite certain that Zc=Ep, not only because Zc is the
letter of highest frequency, but also because it combines with several other high-frequency letters,
such as Dc, Fc, Gc, etc. The nine letters of next highest frequency are:

      23    22    19    19    16    15    14    10    10
       D     T     F     G     V     H     Y     S     I

Let the combinations these letters form with Zc be indicated in the following manner:

Number of times Zc occurs as prefix...     9      4      4      1      0      0      6      0      0
Cipher letter.........................   D(23)  T(22)  F(19)  G(19)  V(16)  H(15)  Y(14)  S(10)  I(10)
Number of times Zc occurs as suffix...     9      5      2      5      0      0      2      0      0
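
The tabulation of combinations with Zc can be made mechanically. The sketch below, in Python, is offered only as an illustration: the names z_affinity and classify are hypothetical, the cipher text is assumed to be that written out in par. 31a, and the cut-off of three combinations used to separate the two classes is an arbitrary assumption, not a value given in the text.

```python
# Cipher text of the illustrative cryptogram (rows as in par. 31a).
CIPHER = (
    "SFDZFIOGHLPZFGZDYSPFHBZDS"
    "GVHTFUPLVDFGYVJVFVHTGADZZ"
    "AITYDZYFZJZTGPTVTZBDVFHTZ"
    "DFXSBGIDZYVTXOIYVTEFVMGZZ"
    "THLLVXZDFMHTZAITYDZYBDVFH"
    "TZDFKZDZZJSXISGZYGAVFSLGZ"
    "DTHHTCDZRSVTYZDOZFFHTZAIT"
    "YDZYGAVDGZZTKHITYZYSDZGHU"
    "ZFZTGUPGDIXWGHXASRUZDFUID"
    "EGHTVEAGXX"
)


def z_affinity(text, e_equiv, letters):
    """For each letter return (times e_equiv precedes it, times e_equiv
    follows it), counting overlapping digraphs."""
    digraphs = [text[i:i + 2] for i in range(len(text) - 1)]
    return {
        x: (digraphs.count(e_equiv + x), digraphs.count(x + e_equiv))
        for x in letters
    }


def classify(table, threshold=3):
    """Split letters into (probable vowels, probable consonants).
    `threshold` is an illustrative cut-off, not taken from the text."""
    consonants = [x for x, (p, s) in table.items() if p + s >= threshold]
    vowels = [x for x in table if x not in consonants]
    return vowels, consonants


table = z_affinity(CIPHER, "Z", "DTFGVHYSI")
for letter, (prefix, suffix) in table.items():
    print(letter, prefix, suffix)
print(classify(table))
```

Letters which never adjoin the assumed equivalent of Ep fall into the vowel class; with short cryptograms the cut-off would need to be chosen with judgment rather than fixed in advance.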

d. Consider Dc. It occurs 23 times in the message and 18 of those times it is combined with
Zc, 9 times in the form ZcDc (=Eθp), and 9 times in the form DcZc (=θEp). It is clear that Dc
must be a consonant. In the same way, consider Tc, which shows 9 combinations with Zc, 4 in the
form ZcTc (=Eθp) and 5 in the form TcZc (=θEp). The letter Tc appears to represent a consonant,
as do also the letters Fc, Gc, and Yc. On the other hand, consider Vc, occurring in all 16 times but
never in combination with Zc; it appears to represent a vowel, as do also the letters Hc, Sc, and Ic.
So far, then, the following classification would seem logical:

          Vowels                          Consonants

  Zc(=Ep), Vc, Hc, Sc, Ic            Dc, Tc, Fc, Gc, Yc

29. Further analysis of the letters representing vowels and consonants. — a. Op is usually
the vowel of second highest frequency. Is it possible to determine which of the letters V, H, S, Ic
is the cipher equivalent of Op? Let reference be made again to Table 6 in Appendix 1, where it
is seen that the 10 most frequently occurring diphthongs are:

Diphthong....  IO  OU  EA  EI  AI  IE  AU  EO  AY  UE

Frequency....  41  37  35  27  17  13  13  12  12  11

If V, H, S, Ic are really the cipher equivalents of A, I, O, Up (not respectively), perhaps it is possible
to determine which is which by examining the combinations they make among themselves and with
Zc (=Ep). Let the combinations of V, H, S, I, and Z that occur in the message be listed. There
are only the following:

ZZc  4        HIc  1

VHc  2        SVc  1

HHc  1        ISc  1

ZZc is of course EEp. Note the doublet HHc; if Hc is a vowel, then the chances are excellent that
Hc=Op because the doublets AAp, IIp, UUp, are practically non-existent, whereas the double vowel
combination OOp is of next highest frequency to the double vowel combination EEp. If Hc=Op,
then Vc must be Ip because the digraph VHc occurring two times in the message could hardly be
AOp, or UOp, whereas the diphthong IOp is the one of highest frequency in English. So far then, the
tentative (because so far unverified) results of the analysis are as follows:

Zc=Ep     Hc=Op     Vc=Ip

This leaves only two letters, Ic and Sc (already classified as vowels) to be separated into Ap and
Up. Note the digraphs:

HIc=Oθp

SVc=θIp

ISc=θθp

Only two alternatives are open:

(1) Either Ic=Ap and Sc=Up,

(2) Or Ic=Up and Sc=Ap.

If the first alternative is selected, then

HIc=OAp
SVc=UIp
ISc=AUp

If the second alternative is selected, then

HIc=OUp
SVc=AIp
ISc=UAp

The eye finds it difficult to choose between these alternatives; but suppose the frequency values of
the plain-text diphthongs as given in Table 6 of Appendix 1 are added for each of these alternatives,
giving the following:

(1) HIc=OAp, frequency value =  7
    SVc=UIp, frequency value =  6
    ISc=AUp, frequency value = 13

                      Total..   25

(2) HIc=OUp, frequency value = 37
    SVc=AIp, frequency value = 17
    ISc=UAp, frequency value =  5

                      Total..   59

Mathematically, the second alternative is more than twice as probable as the first. Let it be
assumed to be correct and the following (still tentative) values are now at hand:

Zc=Ep     Hc=Op     Vc=Ip     Sc=Ap     Ic=Up
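
The weighing of the two alternatives is simple arithmetic and may be sketched as follows in Python. The names score, KNOWN, and ALTERNATIVES are hypothetical. The frequency values are those printed above; as printed, the three values of the first alternative add to 26 rather than to the stated 25, so one of those figures is presumably misprinted by one unit, which does not affect the comparison.

```python
# Diphthong frequency values as quoted in the text from Table 6 of
# Appendix 1 (only the six values needed here; taken as printed).
DIPHTHONG_FREQ = {"OA": 7, "UI": 6, "AU": 13, "OU": 37, "AI": 17, "UA": 5}

# Tentative values already deduced, and the digraphs still to be explained.
KNOWN = {"Z": "E", "H": "O", "V": "I"}
CIPHER_DIGRAPHS = ["HI", "SV", "IS"]

ALTERNATIVES = {
    1: {"I": "A", "S": "U"},
    2: {"I": "U", "S": "A"},
}


def score(assignment):
    """Sum the frequency values of the plain-text diphthongs produced
    when `assignment` is added to the values already known."""
    key = {**KNOWN, **assignment}
    return sum(DIPHTHONG_FREQ.get(key[a] + key[b], 0) for a, b in CIPHER_DIGRAPHS)


scores = {n: score(a) for n, a in ALTERNATIVES.items()}
best = max(scores, key=scores.get)
print(scores, "preferred alternative:", best)
```

The same scoring could be extended to any number of competing assignments, provided a fuller table of diphthong frequencies is supplied.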

b. Attention is now directed to the letters classified as consonants. How far is it possible
to ascertain their values? The letter Dc, from considerations of frequency alone, would seem
to be Tp, but its frequency, 23, is not considerably greater than that for Tc. It is not much
greater than that for Fc or Gc, with a frequency of 19 each. But perhaps it is possible to ascertain
not the value of one letter alone but of two letters at one stroke. To do this one may make
use of a tetragraph of considerable importance in English, viz, TIONp. For if the analysis
pertaining to the vowels is correct, and if VHc=IOp, then an examination of the letters immediately
before and after the digraph VHc in the cipher text might disclose both Tp and Np. Reference
to the text gives the following:

GVHTc        FVHTc

θIOθp        θIOθp

The letter Tc follows VHc in both cases and very probably indicates that Tc=Np; but as to whether
Gc or Fc equals Tp cannot be decided. However, two conclusions are clear: first, the letter Dc
is neither Tp nor Np, from which it follows that it must be either Rp or Sp; second, the letters
Gc and Fc must be either Tp and Sp, respectively, or Sp and Tp, respectively, because the only
tetragraphs usually found (in English) containing the diphthong IOp as central letters are SIONp
and TIONp. This in turn means that as regards Dc, the choice no longer lies between Rp and Sp;
since Sp must be represented by either Gc or Fc, Dc must be Rp, a conclusion which is corroborated
by the fact that ZDc (=ERp) and DZc (=REp) occur 9 times each. Thus far, then, the identifications,
when inserted in an enciphering alphabet, are as follows:

Plain.....  A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

Cipher....  S       Z       V         T H     D G F I

                                                F G
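
The examination of the letters standing immediately before and after a given digraph can be sketched in Python as follows. This is only an illustration; the function name neighbours is hypothetical and the cipher text is assumed to be that written out in par. 31a.

```python
# Cipher text of the illustrative cryptogram (rows as in par. 31a).
CIPHER = (
    "SFDZFIOGHLPZFGZDYSPFHBZDS"
    "GVHTFUPLVDFGYVJVFVHTGADZZ"
    "AITYDZYFZJZTGPTVTZBDVFHTZ"
    "DFXSBGIDZYVTXOIYVTEFVMGZZ"
    "THLLVXZDFMHTZAITYDZYBDVFH"
    "TZDFKZDZZJSXISGZYGAVFSLGZ"
    "DTHHTCDZRSVTYZDOZFFHTZAIT"
    "YDZYGAVDGZZTKHITYZYSDZGHU"
    "ZFZTGUPGDIXWGHXASRUZDFUID"
    "EGHTVEAGXX"
)


def neighbours(text, digraph):
    """Return (letter before, letter after) for every occurrence of
    `digraph` that has a letter on both sides."""
    n = len(digraph)
    return [
        (text[i - 1], text[i + n])
        for i in range(1, len(text) - n)
        if text[i:i + n] == digraph
    ]


# VHc is assumed to stand for IOp; look for the T and N of TIONp around it.
for before, after in neighbours(CIPHER, "VH"):
    print(before + "VH" + after)
```

The two tetragraphs printed are GVHT and FVHT, the same two cases cited in the text.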

30. Substituting deduced values in the cryptogram. — a. Thus far the analysis has been 
almost purely hypothetical, for as yet not a single one of the values deduced from the foregoing
analysis has been tried out in the cryptogram. It is high time that this be done, because the
final test of the validity of the hypotheses, assumptions, and identifications made in any
cryptographic study is, after all, only this: do these hypotheses, assumptions, and identifications
ultimately yield verifiable, intelligible plain-text when consistently applied to the cipher text?

b. At the present stage in the process, since there are at hand the assumed values of but 9
out of the 25 letters that appear, it is obvious that a continuous “reading” of the cryptogram
can certainly not be expected from a mere insertion of the values of the 9 letters. However, the
substitution of these values should do two things. First, it should immediately disclose the
fragments, outlines, or “skeletons” of “good” words in the text; and second, it should disclose
no places in the text where “impossible” sequences of letters are established. By the first is
meant that the partially deciphered text should show the outlines or skeletons of words such
as may be expected to be found in the communication; this will become quite clear in the next
subparagraph. By the second is meant that sequences, such as “AOOEN” or “TNRSENO” or the
like, obviously not possible or extremely unusual in normal English text, must not result from
the substitution of the tentative identifications resulting from the analysis. The appearance
of several such extremely unusual or impossible sequences at once signifies that one or more of
the assumed values is incorrect.

c. Here are the results of substituting the nine values which have been deduced by the
reasoning based on a classification of the high-frequency letters into vowels and consonants
and the study of the members of the two groups:

[Figure: the cryptogram in rows of 25 letters, columns numbered 1 to 25, with the nine deduced
values written beneath the cipher letters.]

d. No impossible sequences are brought to light, and, moreover, several long words, nearly
complete, stand out in the text. Note the following portions:

          A21
          H B Z D S G V H T F
    (1)   O ? E R A S I O N T
                    T       S

          C15
          T V T Z B D V F H T Z D F
    (2)   N I N E ? R I T O N E R T
                        S         S

          F22
          S L G Z D T H H T
    (3)   A ? S E R N O O N
              T

The words are obviously OPERATIONS, NINE PRISONERS, and AFTERNOON. The value of Gc is
clearly Tp; that of Fc is Sp; and the following additional values are certain:

Bc=Pp     Lc=Fp
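
The substitution of tentative values, with both choices tried for the two letters still in doubt, may be sketched in Python as follows. The names KNOWN, skeleton, and CHOICES are hypothetical, and the question mark standing for an undetermined letter follows the usage of the portions shown above.

```python
# The seven values regarded as settled at this stage of the analysis.
KNOWN = {"Z": "E", "H": "O", "V": "I", "S": "A", "I": "U", "T": "N", "D": "R"}

# Gc and Fc are Tp and Sp in one order or the other.
CHOICES = [{"G": "T", "F": "S"}, {"G": "S", "F": "T"}]


def skeleton(cipher, values, unknown="?"):
    """Replace every cipher letter of known value by its plain-text
    equivalent and mark every other letter with `unknown`."""
    return "".join(values.get(c, unknown) for c in cipher)


for portion in ("HBZDSGVHTF", "TVTZBDVFHTZDF", "SLGZDTHHT"):
    print(portion, skeleton(portion, KNOWN))
    for choice in CHOICES:
        print(" " * len(portion), skeleton(portion, {**KNOWN, **choice}))
```

Only the first choice yields skeletons that can be completed into words, which is the test of validity described in subparagraph b.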

31. Completing the solution. — a. Each time an additional value is obtained, substitution
is at once made throughout the cryptogram. This leads to the determination of further values,
in an ever-widening circle, until all the identifications are firmly and finally established, and the
message is completely solved. In this case the decipherment is as follows:

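      1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25

A     S  F  D  Z  F  I  O  G  H  L  P  Z  F  G  Z  D  Y  S  P  F  H  B  Z  D  S
      A  S  R  E  S  U  L  T  O  F  Y  E  S  T  E  R  D  A  Y  S  O  P  E  R  A

B     G  V  H  T  F  U  P  L  V  D  F  G  Y  V  J  V  F  V  H  T  G  A  D  Z  Z
      T  I  O  N  S  B  Y  F  I  R  S  T  D  I  V  I  S  I  O  N  T  H  R  E  E

C     A  I  T  Y  D  Z  Y  F  Z  J  Z  T  G  P  T  V  T  Z  B  D  V  F  H  T  Z
      H  U  N  D  R  E  D  S  E  V  E  N  T  Y  N  I  N  E  P  R  I  S  O  N  E

D     D  F  X  S  B  G  I  D  Z  Y  V  T  X  O  I  Y  V  T  E  F  V  M  G  Z  Z
      R  S  C  A  P  T  U  R  E  D  I  N  C  L  U  D  I  N  G  S  I  X  T  E  E

E     T  H  L  L  V  X  Z  D  F  M  H  T  Z  A  I  T  Y  D  Z  Y  B  D  V  F  H
      N  O  F  F  I  C  E  R  S  X  O  N  E  H  U  N  D  R  E  D  P  R  I  S  O

F     T  Z  D  F  K  Z  D  Z  Z  J  S  X  I  S  G  Z  Y  G  A  V  F  S  L  G  Z
      N  E  R  S  W  E  R  E  E  V  A  C  U  A  T  E  D  T  H  I  S  A  F  T  E

G     D  T  H  H  T  C  D  Z  R  S  V  T  Y  Z  D  O  Z  F  F  H  T  Z  A  I  T
      R  N  O  O  N  Q  R  E  M  A  I  N  D  E  R  L  E  S  S  O  N  E  H  U  N

H     Y  D  Z  Y  G  A  V  D  G  Z  Z  T  K  H  I  T  Y  Z  Y  S  D  Z  G  H  U
      D  R  E  D  T  H  I  R  T  E  E  N  W  O  U  N  D  E  D  A  R  E  T  O  B

J     Z  F  Z  T  G  U  P  G  D  I  X  W  G  H  X  A  S  R  U  Z  D  F  U  I  D
      E  S  E  N  T  B  Y  T  R  U  C  K  T  O  C  H  A  M  B  E  R  S  B  U  R

K     E  G  H  T  V  E  A  G  X  X
      G  T  O  N  I  G  H  T  X  X

Message: AS RESULT OF YESTERDAYS OPERATIONS BY FIRST DIVISION THREE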
HUNDRED SEVENTY NINE PRISONERS CAPTURED INCLUDING SIXTEEN OFFICERS ONE 
HUNDRED PRISONERS WERE EVACUATED THIS AFTERNOON REMAINDER LESS ONE HUNDRED 
THIRTEEN WOUNDED ARE TO BE SENT BY TRUCK TO CHAMBERSBURG TONIGHT 

b. The solution should, as a rule, not be considered complete until an attempt has been 
made to discover all the elements underlying the general system and the specific key to a message. 
In this case, there is no need to delve further into the general system, for it is merely one of 
monoalphabetic substitution with a mixed cipher alphabet. It is necessary or advisable, how- 
ever, to reconstruct the cipher alphabet because this may give clues that later may become 
valuable. 

c. Cipher alphabets should, as a rule, be reconstructed by the cryptanalyst in the form of
enciphering alphabets because they will then usually be in the form in which the encipherer
used them. This is important for two reasons. First, if the sequence in the cipher component
gives evidence of system in its construction or if it yields clues pointing toward its derivation
from a keyword or a key-phrase, this may often corroborate the identifications already made
and may lead directly to additional identifications. A word or two of explanation is advisable
here. For example, refer to the skeletonized enciphering alphabet given at the end of par. 29b:



Plain.....  A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

Cipher....  S       Z       V         T H     D G F I

                                                F G



Suppose the cryptanalyst, looking at the sequence DGFI or DFGI in the cipher component,
suspects the presence of a keyword-mixed alphabet. Then DFGI is certainly a more plausible
sequence than DGFI. Again, noting the sequence S . . . Z . . . V . . . . TH . . D, he might
have an idea that the keyword begins after the Z and that the TH is followed by AB or BC. This
would mean that either P, Qp=A, Bc or B, Cc. Assuming that P, Qp=A, Bc, he refers to the
frequency distribution and finds that the assumptions Pp=Ac and Qp=Bc are not good; on the other
hand, assuming that P, Qp=B, Cc, the frequency distribution gives excellent corroboration.
A trial of these values would materially hasten solution because it is often the case in
cryptanalysis that if the value of a very low-frequency letter can be surely established it will yield
clues to other values very quickly. Thus, if Qp is definitely identified it almost invariably will
identify Up, and will give clues to the letter following the Up, since it must be a vowel. In the
case under discussion the identification PQp=BCc would have turned out to be correct. For the
foregoing reason an attempt should always be made in the early stages of the analysis to
determine, if possible, the basis of construction or derivation of the cipher alphabet; as a rule this
can be done only by means of the enciphering alphabet, and not the deciphering alphabet. For
example, the skeletonized deciphering alphabet corresponding to the enciphering alphabet
directly above is as follows:



Cipher....  A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

Plain.....        R   T S O U                   A N   I       E

                      S T



Here no evidences of a keyword-mixed alphabet are seen at all. However, if the enciphering
alphabet has been examined and shows no evidences of systematic construction, the deciphering
alphabet should then be examined with this in view, because occasionally it is the deciphering
alphabet which shows the presence of a key or keying element, or which has been systematically
derived from a word or phrase. The second reason why it is important to try to discover the basis
of construction or derivation of the cipher alphabet is that it affords clues to the general type of
keywords or keying elements employed by the enemy. This is a psychological factor, of course,
and may be of assistance in subsequent studies of his traffic. It merely gives a clue to the general
type of thinking indulged in by certain of his cryptographers.

d. In the case of the foregoing solution, the complete enciphering alphabet is found to be as 
follows: 

Plain.....  ABCDEFGHIJKLMNOPQRSTUVWXYZ

Cipher....  SUXYZLEAVNWORTHBCDFGIJKMP

Obviously, the letter Q, which is the only letter not appearing in the cryptogram, should follow
P in the cipher component. Note now that the latter is based upon the keyword LEAVENWORTH,
and that this particular cipher alphabet has been composed by shifting the mixed sequence based
upon this keyword five intervals to the right so that the key for the message is Ap=Sc. Note
also that the deciphering alphabet fails to give any evidence of keyword construction based upon
the word LEAVENWORTH.

Cipher....  ABCDEFGHIJKLMNOPQRSTUVWXYZ

Plain.....  HPQRGSTOUVWFXJLYZMANBIKCDE
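
The derivation of the cipher alphabet from the keyword, its shift, and its inversion into a deciphering alphabet can be verified with a short sketch in Python. The function names keyword_mixed, shifted, inverse, and decipher are hypothetical; the construction assumed is the simple keyword-mixed sequence described in the text (keyword letters first, repeated letters omitted, followed by the remaining letters in normal order).

```python
from string import ascii_uppercase as NORMAL


def keyword_mixed(keyword):
    """Keyword letters (first occurrences only) followed by the rest of
    the alphabet in normal order."""
    return "".join(dict.fromkeys(keyword + NORMAL))


def shifted(sequence, letter_under_a):
    """Slide the sequence so that `letter_under_a` stands under plain A."""
    k = sequence.index(letter_under_a)
    return sequence[k:] + sequence[:k]


def inverse(cipher_alphabet):
    """Deciphering alphabet: the plain equivalent of cipher A, B, C, ..."""
    return "".join(NORMAL[cipher_alphabet.index(c)] for c in NORMAL)


def decipher(text, cipher_alphabet):
    return text.translate(str.maketrans(cipher_alphabet, NORMAL))


mixed = keyword_mixed("LEAVENWORTH")
cipher_alphabet = shifted(mixed, "S")      # key for the message: Ap=Sc
print(mixed)
print(cipher_alphabet)
print(inverse(cipher_alphabet))
print(decipher("SFDZFIOGHLPZFGZDYSPFHBZDS", cipher_alphabet))
```

The first line of the cryptogram is deciphered as a check; the inverse alphabet printed is the deciphering alphabet shown directly above, in which no trace of the keyword is visible.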

e. If neither the enciphering nor the deciphering alphabet exhibits characteristics which
give indication of derivation from a keyword by some form of mixing or disarrangement, the
latter is nevertheless not finally excluded as a possibility. The student is referred to Section IX
of Elementary Military Cryptography, wherein will be found methods for deriving mixed alphabets
by transposition methods applied to keyword-mixed alphabets. For the reconstruction of such
mixed alphabets the cryptanalyst must use ingenuity and a knowledge of the more common
methods of suppressing the appearance of keywords in the mixed alphabets.

32. General notes on the foregoing solution. — a. The example solved above is admittedly
a more or less artificial illustration of the steps in analysis, made so in order to demonstrate
general principles. It was easy to solve because the frequencies of the various cipher letters
corresponded quite well with the normal or expected frequencies. However, all cryptograms of
the same monoalphabetical nature can be solved along the same general lines, after more or less
experimentation, depending upon the length of the cryptogram, the skill, and the experience of
the cryptanalyst.

b. It is no cause for discouragement if the student’s initial attempts to solve a cryptogram of 
this type require much more time and effort than were apparently required in solving the fore- 
going purely illustrative example. It is indeed rarely the case that every assumption made by the 
cryptanalyst proves in the end to have been correct; more often is it the case that a good many 
of his initial assumptions are incorrect, and that he loses much time in casting out the erroneous 
ones. The speed and facility with which this elimination process is conducted is in many cases 
all that distinguishes the expert from the novice. 

c. Nor will the student always find that the initial classification into vowels and consonants
can be accomplished as easily and quickly as was apparently the case in the illustrative example.
The principles indicated are very general in their nature and applicability, and there are, in
addition, some other principles that may be brought to bear in case of difficulty. Of these,
perhaps the most useful are the following:

(1) In normal English it is unusual to find two or three consonants in succession, each of high
frequency. If in a cryptogram three or four letters of high frequency appear in succession,
it is practically certain that at least one of these represents a vowel.*

* Sequences of seven consonants are not impossible, however, as in STRENGTH THROUGH.

(2) Successions of three vowels are rather unusual in English.† Practically the only time
this happens is when a word ends in two vowels and the next word begins with a vowel.‡

(3) When two letters already classified as vowel-equivalents are separated by a sequence of
six or more letters, it is either the case that one of the supposed vowel-equivalents is incorrect,
or else that one or more of the intermediate letters is a vowel-equivalent.§

(4) Reference to Table 7-B of Appendix 1 discloses the following:

Distribution of first 18 digraphs forming 25 percent of English text

Number of consonant-consonant digraphs.....   4
Number of consonant-vowel digraphs.........   6
Number of vowel-consonant digraphs.........   8
Number of vowel-vowel digraphs.............   0

Distribution of first 53 digraphs forming 50 percent of English text

Number of consonant-consonant digraphs.....   8
Number of consonant-vowel digraphs.........  23
Number of vowel-consonant digraphs.........  18
Number of vowel-vowel digraphs.............   4

The latter tabulation shows that of the first 53 digraphs which form 50 percent of English text,
41 of them, that is, over 75 percent, are combinations of a vowel with a consonant. In short,
in normal English the vowels and the high-frequency consonants are in the long run
distributed fairly evenly and regularly throughout the text.

(5) As a rule, repetitions of trigraphs in the cipher text are composed of high-frequency
letters forming high-frequency combinations. The latter practically always contain at least one
vowel; in fact, if reference is made to Table 10-A of Appendix 1, it will be noted that 36 of the 56
trigraphs having a frequency of 100 or more contain one vowel, 17 of them contain two vowels,
and only three of them contain no vowel. In the case of tetragraph repetitions, Table 11-A of
Appendix 1 shows that no tetragraph listed therein fails to contain at least one vowel; 28 of them
contain one vowel, 25 contain two vowels, and 2 contain three vowels.

(6) Quite frequently when two known vowel-equivalents are separated by six or more letters
none of which seems to be of sufficiently high frequency to represent one of the vowels A E I O,
the chances are good that the cipher-equivalent of the vowel U or Y is present.

(7) The letter Q is invariably followed by U; the letters J and V are invariably followed by a 
vowel. 
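
The first tabulation under (4) can be checked directly against the 18 digraphs listed in par. 28b. The following Python sketch is illustrative only; the names kind and tally are hypothetical, and Y is treated as a consonant, which is an assumption of no consequence here since none of the 18 digraphs contains it.

```python
from collections import Counter

# The 18 digraphs composing 25 percent of English telegraphic text,
# as listed in par. 28b.
DIGRAPHS = "ED EN ER ES NE RE SE TE VE AN ND OR ST IN NT TH ON TO".split()
VOWELS = set("AEIOU")  # Y counted as a consonant (assumption)


def kind(digraph):
    """Return 'VC', 'CV', 'CC', or 'VV' for a two-letter combination."""
    return "".join("V" if letter in VOWELS else "C" for letter in digraph)


tally = Counter(kind(d) for d in DIGRAPHS)
with_e = [d for d in DIGRAPHS if "E" in d]
e_with_nrst = [d for d in with_e if (set(d) - {"E"}).issubset(set("NRST"))]

for label in ("CC", "CV", "VC", "VV"):
    print(label, tally[label])
print("digraphs containing E:", len(with_e), "of which with N, R, S, T:", len(e_with_nrst))
```

The same function applied to the digraphs of a partially solved cryptogram would show at a glance whether a proposed division into vowels and consonants produces an abnormal number of vowel-vowel or consonant-consonant combinations.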

d. In the foregoing example the amount of experimentation or “cutting and fitting” was
practically nil. (This is not true of real cases as a rule.) Where such experimentation is
necessary, the underscoring of all repetitions of several letters is very essential, as it calls
attention to peculiarities of structure that often yield clues.

† Note that the word RADIOED, past tense of the verb RADIO, is coming into usage.

‡ A sequence of seven vowels is not impossible, however, as in THE WAY YOU EARN.

§ Some cryptanalysts place a good deal of emphasis upon this principle as a method of locating the remaining
vowels after the first two or three have been located. They recommend that the latter be underlined throughout
the text and then all sequences of five or more letters showing no underlines be studied attentively. Certain
letters which occur in several such sequences are sure to be vowels. An arithmetical aid in the study is as follows:
Take a letter thought to be a good possibility as the cipher equivalent of a vowel (hereafter termed a possible
vowel-equivalent) and find the length of each interval from the possible vowel-equivalent to the next known (fairly
surely determined) vowel-equivalent. Multiply the interval by the number of times this interval is found. Add
the products and divide by the total number of intervals considered. This will give the mean interval for that
possible vowel-equivalent. Do the same for all the other possible vowel-equivalents. The one for which the
mean is the greatest is most probably a vowel-equivalent. Underline this letter throughout the text and repeat
the process for locating additional vowel-equivalents, if any remain to be located.

e. After a few basic assumptions of values have been made, if short words or skeletons of
words do not become manifest, it is necessary to make further assumptions for unidentified letters.
This is accomplished most often by assuming a word.‖ Now there are two places in every message
which lend themselves more readily to successful attack by the assumption of words than do
any other places: the very beginning and the very end of the message. The reason is quite
obvious, for although words may begin or end with almost any letter of the alphabet, they
usually begin and end with but a few very common digraphs and trigraphs. Very often the
association of letters in peculiar combinations will enable the student to note where one word
ends and the next begins. For example, suppose E, N, S, and T have been definitely identified,
and a sequence like the following is found in a cryptogram:

. . . E N T S N E . . .

Obviously the break between two words should fall either after the S of E N T S or after the T
of E N T, so that two possibilities are offered: . . . E N T S / N E . . ., or . . . E N T / S N E
. . . . Since in English there are very few words with the initial trigraph S N E, it is most
likely that the proper division is . . . E N T S / N E . . . . Obviously, when several word
divisions have been found, the solution is more readily achieved because of the greater ease with
which assumptions of additional new values may be made.

33. The “probable word” method; its value and applicability. — a. In practically all
cryptanalytic studies, short-cuts can often be made by assuming the presence of certain words in the
message under study. Some writers attach so much value to this kind of an “attack from the
rear” that they practically elevate it to the position of a method and call it the “intuitive method”
or the “probable-word method.” It is, of course, merely a refinement of what in every-day
language is called “assuming” or “guessing” a word in the message. The value of making a
“good guess” can hardly be overestimated, and the cryptanalyst should never feel that he is
accomplishing a solution by an illegitimate subterfuge when he has made a fortunate guess
leading to solution. A correct assumption as to plain text will often save hours or days of labor,
and sometimes there is no alternative but to try to “guess a word”, for occasionally a system is
encountered the solution of which is absolutely dependent upon this artifice.

b. The expression “good guess” is used advisedly. For it is “good” in two respects. First,
the cryptanalyst must use care in making his assumptions as to plain-text words. In this he
must be guided by extraneous circumstances leading to the assumption of probable words, not
just any words that come to his mind. Therefore he must use his imagination but he must
nevertheless carefully control it by the exercise of good judgment. Second, only if the “guess”
is correct and leads to solution, or at least puts him on the road to solution, is it a good guess.
But, while realizing the usefulness and the time and labor-saving features of a solution by
assuming a probable word, the cryptanalyst should exercise discretion in regard to how long he may
continue in his efforts with this method. Sometimes he may actually waste time by adhering
to the method too long, if straightforward, methodical analysis will yield results more quickly.

c. Obviously, the “probable-word” method has much more applicability when working
upon material the general nature of which is known, than when working upon more or less
isolated communications exchanged between correspondents concerning whom or whose activities
nothing is known. For in the latter case there is little or nothing that the imagination can seize
upon as a background or basis for the assumptions.¶

‖ This process does not involve anything more mysterious than ordinary, logical reasoning; there is nothing
of the subnormal or supernormal about it. If cryptanalytic success seems to require processes akin to those of
medieval magic, if “hocus-pocus” is much to the fore, the student should begin to look for items that the claimant
of such success has carefully hidden from view, for the mystification of the uninitiated. (See Par. 33 in this
connection.)

d. Very frequently, the choice of probable words is aided or limited by the number and
positions of repeated letters. These repetitions may be patent (that is, externally visible in
the cryptographic text as it originally stands) or they may be latent (that is, externally invisible
but susceptible of being made patent as a result of the analysis). For example, in a
monoalphabetic substitution cipher, such as that discussed in the preceding paragraph, the repeated
letters are directly exhibited in the cryptogram; later the student will encounter many cases in which
the repetitions are latent, but are made patent by the analytical process. When the repetitions
are patent, then the pattern or formula to which the repeated letters conform is of direct use
in assuming plain-text words; and when the text is in word-lengths, the pattern is obviously of
even greater assistance. Suppose the cryptanalyst is dealing with military text, in which case
he may expect such words as DIVISION, BATTALION, etc., to be present in the text. The
positions of the repeated letter I in DIVISION, of the reversible digraph AT, TA in BATTALION,
and so on, constitute for the experienced cryptanalyst tell-tale indications of the presence of
these words, even when the text is not divided up into its original word lengths.

e. The important aid that a study of word patterns can afford in cryptanalysis warrants the
use of definite terminology and the establishment of certain data having a bearing thereon. The
phenomenon herein under discussion, namely, that many words are of such construction as
regards the number and positions of repeated letters as to make them readily identifiable, will be
termed idiomorphism (from the Greek “idios” = one’s own, individual, peculiar + “morphe” = form).
Words which show this phenomenon will be termed idiomorphic. It will be useful to deal with
the idiomorphisms symbolically and systematically as described below.

f. When dealing with cryptograms in which the word lengths are determined or specifically
shown, it is convenient to indicate their lengths and their repeated letters in some easily
recognized manner or by formulas. This is exemplified, in the case of the word DIVISION, by the
formula ABCBDBEF; in the case of the word BATTALION, by the formula ABCCBDEFG. If the
cryptanalyst, during the course of his studies, makes note of striking formulas he has
encountered, with the words which fit them, after some time he will have assembled a quite valuable
body of data. And after more or less complete lists of such formulas have been established in
some systematic arrangement, a rapid comparison of the idiomorphs in a specific cryptogram
with those in his lists will be feasible and will often lead to the assumption of the correct word.
Such lists can be arranged according to word length, as shown herewith:

3/aba  : DID, EVE, EYE.
  abb  : ADD, ALL, ILL, OFF, etc.

4/abac : ARAB, AWAY, etc.
  abca : AREA, BOMB, DEAD, etc.
  abbc : . . .
  abcb : . . .
  etc.   etc.
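
The formula of a word, and a list of words arranged by length and formula, can be produced mechanically. The following is a minimal Python sketch; the names idiomorph and build_index are hypothetical, and the short word list is merely that of the examples above, not an established compilation.

```python
from collections import defaultdict


def idiomorph(word):
    """Return the pattern formula of a word: the first distinct letter
    becomes A, the second distinct letter B, and so on."""
    symbols = {}
    formula = []
    for letter in word:
        if letter not in symbols:
            symbols[letter] = chr(ord("A") + len(symbols))
        formula.append(symbols[letter])
    return "".join(formula)


def build_index(words):
    """Group words under (length, formula), as in the list shown above."""
    index = defaultdict(list)
    for word in words:
        index[(len(word), idiomorph(word))].append(word)
    return dict(index)


WORDS = ["DID", "EVE", "EYE", "ADD", "ALL", "ILL", "OFF",
         "ARAB", "AWAY", "AREA", "BOMB", "DEAD"]

print(idiomorph("DIVISION"), idiomorph("BATTALION"))
for (length, formula), members in sorted(build_index(WORDS).items()):
    print(length, formula.lower(), ":", ", ".join(members))
```

A cipher word of known length is looked up by computing its formula in the same way, since monoalphabetic substitution leaves the formula unchanged.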

¶ General Givierge in his Cours de Cryptographie (p. 121) says: “However, expert cryptanalysts often
employ such details as are cited above [in connection with assuming the presence of ‘probable words’], and the
experience of the years 1914 to 1918, to cite only those, prove that in practice one often has at his disposal
elements of this nature, permitting assumptions much more audacious than those which served for the analysis
of the last example. The reader would therefore be wrong in imagining that such fortuitous elements are
encountered only in cryptographic works where the author deciphers a document that he himself enciphered.
Cryptographic correspondence, if it is extensive, and if sufficiently numerous working data are at hand, often
furnishes elements so complete that an author would not dare use all of them in solving a problem for fear of
being accused of obvious exaggeration.”

g. When dealing with cryptographic text in which the lengths of the words are not indicated
or otherwise determinable, lists of the foregoing nature are not so useful as lists in which the
words (or parts of words) are arranged according to the intervals between identical letters, in the
following manner:

1 Interval        2 Intervals       3 Intervals       Repeated digraphs

-DiD-             AbbAcy            AbeyAnce          COCOa
-EvE-             ArAbiA            hAbitAble         dERER
-EyE-             AblAtive          lAborAtory        ICICle
dIvIsIon          AboArd            AbreAst           ININg
revIsIon          -AciA-            AbroAd            bAGgAGe
etc.              etc.              etc.              etc.
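
When the word lengths are not shown, a probable word may still be sought by sliding its formula along the unbroken text. The Python sketch below illustrates the idea; the names idiomorph and find_pattern are hypothetical, and the cipher text is assumed to be that of the cryptogram solved in pars. 28-31.

```python
# Cipher text of the illustrative cryptogram (rows as in par. 31a).
CIPHER = (
    "SFDZFIOGHLPZFGZDYSPFHBZDS"
    "GVHTFUPLVDFGYVJVFVHTGADZZ"
    "AITYDZYFZJZTGPTVTZBDVFHTZ"
    "DFXSBGIDZYVTXOIYVTEFVMGZZ"
    "THLLVXZDFMHTZAITYDZYBDVFH"
    "TZDFKZDZZJSXISGZYGAVFSLGZ"
    "DTHHTCDZRSVTYZDOZFFHTZAIT"
    "YDZYGAVDGZZTKHITYZYSDZGHU"
    "ZFZTGUPGDIXWGHXASRUZDFUID"
    "EGHTVEAGXX"
)


def idiomorph(word):
    """Pattern formula: first distinct letter A, second B, and so on."""
    symbols = {}
    return "".join(
        symbols.setdefault(letter, chr(ord("A") + len(symbols)))
        for letter in word
    )


def find_pattern(text, probable_word):
    """Return every position (counting from 0) at which a stretch of
    text has the same formula as `probable_word`."""
    target = idiomorph(probable_word)
    n = len(probable_word)
    return [
        i for i in range(len(text) - n + 1)
        if idiomorph(text[i:i + n]) == target
    ]


for position in find_pattern(CIPHER, "DIVISION"):
    print(position, CIPHER[position:position + 8])
```

Each position reported is only a candidate; the values it implies must still be tested by substitution throughout the cryptogram, as in par. 30.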



34. Solution of additional cryptograms produced by the same cipher component. — a. To
return, after a rather long digression, to the cryptogram solved in pars. 28-31, once the cipher
component of a cipher alphabet has been reconstructed, subsequent messages which have been
enciphered by means of the same cipher component may be solved very readily, and without
recourse to the principles of frequency, or application of the probable-word method. It has been
seen that the illustrative cryptogram treated in paragraphs 24-31 was enciphered by juxtaposing
the cipher component against the normal sequence so that Ap=Sc. It is obvious that the cipher
component may be set against the plain component at any one of 26 different points of
coincidence, each yielding a different cipher alphabet. After a cipher component has been reconstructed,
however, it becomes a known sequence, and the method of converting the cipher letters into their
plain-component equivalents and then completing the plain-component sequence begun by
each equivalent can be applied to solve any cryptogram which has been enciphered by that
cipher component.

b. An example will serve to make the process clear. Suppose the following message, passing 
between the same two stations as before, was intercepted shortly after the first message had 
been solved: 

IYEWK CERNW OFOSE LFOOH EAZXX 

It is assumed that the same cipher component was used, but with a different key letter. First 
the initial two groups are converted into their plain-component equivalents by setting the 
cipher component against the normal sequence at any arbitrary point of coincidence. The 
initial letter of the former may as well be set against A of the latter, with the following result: 



Plain         ABCDEFGHIJKLMNOPQRSTUVWXYZ 
Cipher        LEAVNWORTHBCDFGIJKMPQSUXYZ 

Cryptogram    IYEWK CERNW ... 
Equivalents   PYBFR LBHEF ... 

The normal sequence initiated by each of these conversion equivalents is now completed, with 
the results shown in Fig. 15. Note the plain-text generatrix, CLOSEYOURS, which manifests 
itself without further analysis. The rest of the message may be read either by continuing the 
same process, or, what is even more simple, the key letter of the message may now be determined 
quite readily and the message deciphered by its means. 

 IYEWKCERNW 

 PYBFRLBHEF 
 QZCGSMCIFG 
 RADHTNDJGH 
 SBEIUOEKHI 
 TCFJVPFLIJ 
 UDGKWQGMJK 
 VEHLXRHNKL 
 WFIMYSIOLM 
 XGJNZTJPMN 
 YHKOAUKQNO 
 ZILPBVLROP 
 AJMQCWMSPQ 
 BKNRDXNTQR 
*CLOSEYOURS 
 DMPTFZPVST 
 ENQUGAQWTU 
 FORVHBRXUV 
 GPSWICSYVW 
 HQTXJDTZWX 
 IRUYKEUAXY 
 JSVZLFVBYZ 
 KTWAMGWCZA 
 LUXBNHXDAB 
 MVYCOIYEBC 
 NWZDPJZFCD 
 OXAEQKAGDE 
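The conversion-completion process of subparagraph b can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `convert` and `complete` are invented here, and the cipher component is the one reconstructed earlier in the text.

```python
import string

NORMAL = string.ascii_uppercase                  # the normal (plain) sequence
CIPHER_COMPONENT = "LEAVNWORTHBCDFGIJKMPQSUXYZ"  # component reconstructed in the text


def convert(cipher_text: str) -> str:
    """Replace each cipher letter by the normal-sequence letter standing
    above it when the component's first letter is set against A."""
    return "".join(NORMAL[CIPHER_COMPONENT.index(c)] for c in cipher_text)


def complete(equivalents: str) -> list:
    """Return the 26 generatrices obtained by continuing the normal
    sequence from every conversion equivalent."""
    rows = []
    for shift in range(26):
        rows.append("".join(NORMAL[(NORMAL.index(c) + shift) % 26]
                            for c in equivalents))
    return rows


equivalents = convert("IYEWKCERNW")   # first two groups of the message
generatrices = complete(equivalents)
for row in generatrices:
    print(row)
```

The analyst scans the printed rows for the single line that reads as plain text; no frequency count is involved.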

c. In order that the student may understand without question just what is involved in the 
latter step, that is, discovering the key letter after the first two or three groups have been 
deciphered by the conversion-completion process, the foregoing example will be used. It was noted 
that the first cipher group was finally deciphered as follows: 

Cipher   I Y E W K 
Plain    C L O S E 

Now set the cipher component against the normal sequence so that Cp=Ic. Thus: 

Plain    ABCDEFGHIJKLMNOPQRSTUVWXYZ 
Cipher   FGIJKMPQSUXYZLEAVNWORTHBCD 

It is seen here that when Cp=Ic then Ap=Fc. This is the key for the entire message. The 
decipherment may be completed by direct reference to the foregoing cipher alphabet. Thus: 

Cipher   IYEWK CERNW OFOSE LFOOH EAZXX 
Plain    CLOSE YOURS TATIO NATTW OPMXX 

Message: CLOSE YOUR STATION AT TWO PM 
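The key-letter step can be sketched as follows. The helper names `alphabet_from_pair` and `decipher` are illustrative assumptions, not part of the original text; only the first four groups are deciphered here.

```python
import string

NORMAL = string.ascii_uppercase
CIPHER_COMPONENT = "LEAVNWORTHBCDFGIJKMPQSUXYZ"  # component reconstructed in the text


def alphabet_from_pair(plain: str, cipher: str) -> str:
    """Slide the cipher component until `cipher` stands under `plain`;
    return the 26 cipher letters standing under A..Z."""
    offset = (CIPHER_COMPONENT.index(cipher) - NORMAL.index(plain)) % 26
    return CIPHER_COMPONENT[offset:] + CIPHER_COMPONENT[:offset]


def decipher(cipher_text: str, cipher_alphabet: str) -> str:
    """Decipher by direct reference to the cipher alphabet; spaces are kept."""
    table = {c: p for p, c in zip(NORMAL, cipher_alphabet)}
    return "".join(table.get(ch, ch) for ch in cipher_text)


alphabet = alphabet_from_pair("C", "I")   # one recovered pair, Cp = Ic
key_letter = alphabet[0]                  # the cipher letter standing under Ap
print(alphabet, key_letter)
print(decipher("IYEWK CERNW OFOSE LFOOH", alphabet))
```

A single confirmed plain-cipher pair fixes the whole alphabet, because the component itself is already known.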

d. The student should make sure that he understands the fundamental principles involved in 
this quick solution, for they are among the most important principles in cryptanalytics. How 
useful they are will become clear as he progresses into more and more complex cryptanalytic studies. 
Section VII 

MULTILITERAL SUBSTITUTION WITH SINGLE-EQUIVALENT CIPHER ALPHABETS 

                                                                  Paragraph 

Analysis of multiliteral, monoalphabetic substitution systems        35 
Historically interesting examples                                    36 



35. Analysis of multiliteral, monoalphabetic substitution systems. a. Substitution methods 
in general may be classified into uniliteral and multiliteral systems.¹ In the former there is a 
strict "one-to-one" correspondence between the length of the units of the plain and those of the 
cipher text; that is, each letter of the plain text is replaced by a single character in the cipher text. 
In the latter this correspondence is no longer 1p:1c but may be 1p:2c, where each letter of the plain 
text is replaced by a combination of two characters in the cipher text; or 1p:3c, where a 3-character 
combination in the cipher text represents a single letter of the plain text, and so on. A cipher in 
which the correspondence is of the 1p:1c type is termed uniliteral in character; one in which it is of 
the 1p:2c type, biliteral; 1p:3c, triliteral, and so on. Those beyond the 1p:1c type are classed 
together as multiliteral. 

b. When a multiliteral system employs biliteral equivalents, the cipher alphabet is said to be 
bipartite. Such alphabets are composed of a set of 25 or 26 combinations of a limited number of 
characters taken in pairs. An example of such an alphabet is the following: 



Plain     A   B   C   D   E   F   G   H   I   J   K   L   M 
Cipher    WW  WH  WI  WT  WE  HW  HH  HI  HT  HT  HE  IW  IH 

Plain     N   O   P   Q   R   S   T   U   V   W   X   Y   Z 
Cipher    II  IT  IE  TW  TH  TI  TT  TE  EW  EH  EI  ET  EE 

This alphabet is derived from the square shown in Fig. 15. 

                  (2) 
          W    H    I    T    E 
     W    A    B    C    D    E 
     H    F    G    H   I-J   K 
(1)  I    L    M    N    O    P 
     T    Q    R    S    T    U 
     E    V    W    X    Y    Z 

               FIGURE 15. 
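A minimal sketch of how a bipartite alphabet is read off such a square, assuming the row coordinate is written first and that I and J share one cell as in the figure. The function names are illustrative only.

```python
SQUARE_ROWS = ["ABCDE", "FGHIK", "LMNOP", "QRSTU", "VWXYZ"]  # I and J share a cell
COORDINATES = "WHITE"  # used for both the rows (1) and the columns (2)


def build_alphabet() -> dict:
    """Map every plain letter to its row coordinate followed by its column coordinate."""
    table = {}
    for r, row in enumerate(SQUARE_ROWS):
        for c, letter in enumerate(row):
            table[letter] = COORDINATES[r] + COORDINATES[c]
    table["J"] = table["I"]
    return table


def encipher(plain: str) -> str:
    table = build_alphabet()
    return " ".join(table[ch] for ch in plain.upper() if ch in table)


def decipher(cipher: str) -> str:
    """Read the text two characters at a time; J always comes back as I."""
    inverse = {v: k for k, v in build_alphabet().items() if k != "J"}
    text = cipher.replace(" ", "")
    return "".join(inverse[text[i:i + 2]] for i in range(0, len(text), 2))


print(encipher("ATTACK"))
print(decipher(encipher("ATTACK")))
```

Because each pair always stands for the same letter, a count of pairs behaves exactly like a uniliteral frequency distribution.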



c. If a message is enciphered by means of the foregoing bipartite alphabet the cryptogram is 
still monoalphabetic in character. A frequency distribution based upon pairs of letters will 
obviously have all the characteristics of a simple, uniliteral distribution for a monoalphabetic 
substitution cipher. 

¹ See Sec. VII, Advanced Military Cryptography. 

d. Ciphers of this type, as well as those of the other multiliteral types (triliteral, 
quadraliteral, . . .), are readily detected externally by virtue of the fact that the cryptographic 
text is composed of but a very limited number of different characters. They are handled in exactly 
the same manner as are uniliteral, monoalphabetic substitution ciphers. So long as the same 
character, or combination of characters, is always used to represent the same plain-text letter, and 
so long as a given letter of the plain text is always represented by the same character or combination 
of characters, the substitution is strictly monoalphabetic and can be handled in the simple manner 
described under Par. 31 of this text. 

e. An interesting example in which the cipher equivalents are quinqueliteral groups and yet 
the resulting cipher is strictly monoalphabetic in character is found in the cipher system invented 
by Sir Francis Bacon over 300 years ago. Despite its antiquity the system possesses certain 
features of merit which are well worth noting. Bacon² proposed the following cipher alphabet, 
composed of permutations of two elements taken five at a time: ³ 



A=aaaaa      I-J=abaaa      R=baaaa 
B=aaaab      K=abaab        S=baaab 
C=aaaba      L=ababa        T=baaba 
D=aaabb      M=ababb        U-V=baabb 
E=aabaa      N=abbaa        W=babaa 
F=aabab      O=abbab        X=babab 
G=aabba      P=abbba        Y=babba 
H=aabbb      Q=abbbb        Z=babbb 

If this were all there were to Bacon's invention it would be hardly worth bringing to attention. 
But what he pointed out, with great clarity and simple examples, was how such an alphabet 
might be used to convey a secret message by enfolding it in an innocent, external message which 
might easily evade the strictest kind of censorship. As a very crude example, suppose that a 
message is written in capital and lower case letters, any capital letter standing for an "a" element 
of the cipher alphabet, and any small letter, for a "b" element. Then the external sentence 
"All is well with me today" can be made to contain the secret message "Help." Thus: 

ALl is WElL WItH mE TodaY 
aab bb aaba aaba ba abbba 
  H      E     L      P 

Instead of employing such an obvious device as capital and small letters, suppose that an "a" 
element be indicated by a very slight shading, or a very slightly heavier stroke. Then a secret 
message might easily be thus enfolded within an external message of exactly opposite meaning. 
The number of possible variations of this basic scheme is very high. The fact that the characters 
of the cryptographic text are hidden in some manner or other has, however, no effect upon the 
strict monoalphabeticity of the scheme. 

² For a true picture of this cipher, the explanation of which is often distorted beyond recognition even by 
cryptographers, see Bacon's own description of it as contained in his De Augmentis Scientiarum (The Advancement of 
Learning), as translated by any first-class editor, such as Gilbert Watts (1640) or Ellis, Spedding, and Heath 
(1867, 1870). The student is cautioned, however, not to accept as true any alleged "decipherments" obtained 
by the application of Bacon's cipher to literary works of the 16th century. These readings are purely subjective. 

³ In the 16th Century, the letters I and J were used interchangeably, as were also U and V. Bacon's alphabet 
was called by him a "biliteral alphabet" because it employs permutations of two letters. But from the 
cryptanalytic standpoint the significant point is that each plain-text letter is represented by a 5-character equivalent. 
Hence, present terminology requires that this alphabet be referred to as a quinqueliteral alphabet. 
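The capital-and-small-letter device can be sketched as below. It assumes Bacon's 24-letter alphabet in its usual order, so that each group is simply the letter's position written in two symbols; the names `enfold` and `extract` are invented for this illustration.

```python
BACON_LETTERS = "ABCDEFGHIKLMNOPQRSTUWXYZ"  # 24 letters: J goes with I, V with U


def bacon_alphabet() -> dict:
    """Each letter's position, written as five binary digits with 0 -> a, 1 -> b."""
    return {letter: format(i, "05b").replace("0", "a").replace("1", "b")
            for i, letter in enumerate(BACON_LETTERS)}


def enfold(secret: str, cover: str) -> str:
    """Capitalize a cover letter for an 'a' element, use a small letter for 'b'."""
    table = bacon_alphabet()
    secret = secret.upper().replace("J", "I").replace("V", "U")
    elements = "".join(table[ch] for ch in secret if ch in table)
    out, k = [], 0
    for ch in cover:
        if ch.isalpha() and k < len(elements):
            out.append(ch.upper() if elements[k] == "a" else ch.lower())
            k += 1
        else:
            out.append(ch)          # spaces, and any cover letters left over
    if k < len(elements):
        raise ValueError("cover text is too short for the secret message")
    return "".join(out)


def extract(text: str) -> str:
    """Read capitals as 'a' and small letters as 'b', five elements per letter."""
    inverse = {v: k for k, v in bacon_alphabet().items()}
    elements = "".join("a" if ch.isupper() else "b" for ch in text if ch.isalpha())
    usable = len(elements) - len(elements) % 5
    return "".join(inverse.get(elements[i:i + 5], "?") for i in range(0, usable, 5))


hidden = enfold("Help", "All is well with me today")
print(hidden)
print(extract(hidden))
```

The cover text needs five letters for every secret letter, which is the price paid for the concealment.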

36. Historically interesting examples. a. Two examples of historical interest will be cited 
in this connection as illustrations. During the campaign for the presidential election of 1876 
many cipher messages were exchanged between the Tilden managers and their agents in several 
states where the voting was hotly contested. Two years later the New York Tribune⁴ exposed 
many irregularities in the campaign by publishing the decipherments of many of these messages. 
These decipherments were achieved by two investigators employed by the Tribune, and the 
plain text of the messages seems to show that illegal attempts and measures to carry the election 
for Tilden were made by his managers. Here is one of the messages: 

JACKSONVILLE, Nov. 16 (1876). 

GEO. F. RANEY, Tallahassee. 

PPYYEMNSNYYYPIMASHNSYYSSITEPAAENSHNS 
PENSSHNSMMPIYYSNPPYEAAPIEISSYESHAINSSSP 
EEIYYSHNYNSSSYEPIAANYITNSSHYYSPYYPINSYY 
SSITEMEIPIMMEISSEIYYEISSITEIEPYYPEEIAASS 
IMAAYESPNSYYIANSSSEISSMMPPNSPINSSNPINSIM 
IMYYITEMYYSSPEYYMMNSYYSSITSPYYPEEPPPMA 
AAYYPIIT 
L'Engle goes up tomorrow. 

DANIEL. 

Examination of the message discloses that only ten different letters are used. It is probable, 
therefore, that what one has here is a cipher which employs a bipartite alphabet and in which 
combinations of two letters represent single letters of the plain text. The message is therefore 
rewritten in pairs and substitution of arbitrary letters for the pairs is made, as seen below: 

PP YY EM NS NY YY PI MA SH NS YY SS etc. 
A  B  C  D  E  B  F  G  H  D  B  I  etc. 

A triliteral frequency distribution is then made and analysis of the message along the lines 
illustrated in the preceding section of this text yields solution, as follows: 

Jacksonville, Nov. 16. 

Geo. F. Raney, Tallahassee: 

Have Marble and Coyle telegraph for influential men from Delaware and Virginia. 
Indications of weakening here. Press advantage and watch Board. L'Engle goes up tomorrow. 

Daniel. 
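The preliminary step of rewriting the text in pairs and replacing each pair by an arbitrary letter reduces the biliteral cipher to uniliteral terms. A minimal sketch, assuming no more than 26 different pairs occur; the function name is illustrative.

```python
import string


def to_uniliteral(cipher_text: str):
    """Split the text into pairs and give each new pair the next unused letter."""
    letters = [ch for ch in cipher_text.upper() if ch.isalpha()]
    pairs = ["".join(letters[i:i + 2]) for i in range(0, len(letters) - 1, 2)]
    labels = {}          # pair -> arbitrary letter, in order of first appearance
    reduced = []
    for pair in pairs:
        if pair not in labels:
            labels[pair] = string.ascii_uppercase[len(labels)]
        reduced.append(labels[pair])
    return "".join(reduced), labels


reduced, labels = to_uniliteral("PPYYEMNSNYYYPIMASHNSYYSS")
print(reduced)
print(labels)
```

The reduced text can then be attacked with the ordinary frequency and idiomorph methods for monoalphabetic ciphers.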

b. The other example, using numbers, is as follows: 

JACKSONVILLE, Nov. 17. 

S. PASCO and E. M. L'ENGLE: 

84 55 84 25 93 34 82 31 31 75 93 82 77 33 55 42 
93 20 93 66 77 66 33 84 66 31 31 93 20 82 33 66 
52 48 44 55 42 82 48 89 42 93 31 82 66 75 31 93 

DANIEL. 

There were, of course, several messages of like nature, and examination disclosed that 
only 26 different numbers in all were used. Solution of these ciphers followed very easily, the 
decipherment of the one given above being as follows: 

Jacksonville, Nov. 17. 

S. Pasco and E. M. L'Engle: 

Cocke will be ignored, Eagan called in. Authority reliable. 

Daniel. 

⁴ New York Tribune, Extra No. 44, The Cipher Dispatches, New York, 1879. 



c. The Tribune experts gave the following alphabets as the result of their decipherments: 

AA=O   EN=Y   IT=D   NS=E   PP=H   SS=N 
AI=U   EP=C   MA=B   NY=M   SH=L   YE=F 
EI=I   IA=K   MM=G   PE=T   SN=P   YI=X 
EM=V   IM=S   NN=J   PI=R   SP=W   YY=A 

20=D   33=N   44=H   62=X   77=G   89=Y 
25=K   34=W   48=T   66=A   82=I   93=E 
27=S   39=P   52=U   68=F   84=C   96=M 
31=L   42=R   55=O   75=B   87=V   99=J 



They did not attempt to correlate these alphabets, or at least they say nothing about a possible 
relationship. The present author has, however, reconstructed the rectangle upon which these 
alphabets are based, and it is given below (fig. 16). 



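                         2d Letter or Number 
                 H   I   S   P   A   Y   M   E   N   T 
                 1   2   3   4   5   6   7   8   9   0 
           H 1   -   -   -   -   -   -   -   -   -   - 
           I 2   -   -   -   -   K   -   S   -   -   D 
1st        S 3   L   -   N   W   -   -   -   -   P   - 
Letter     P 4   -   R   -   H   -   -   -   T   -   - 
or         A 5   -   U   -   -   O   -   -   -   -   - 
Number     Y 6   -   X   -   -   -   A   -   F   -   - 
           M 7   -   -   -   -   B   -   G   -   -   - 
           E 8   -   I   -   C   -   -   V   -   Y   - 
           N 9   -   -   E   -   -   M   -   -   J   - 
           T 0   -   -   -   -   -   -   -   -   -   - 

                 (A dash marks a blank square.) 

                          FIGURE 16. 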



It is amusing to note that the conspirators selected as their key a phrase quite in keeping with 
their attempted illegalities, HIS PAYMENT, for bribery seems to have played a considerable 
part in that campaign. The blank squares in the diagram probably contained proper names, 
numbers, etc. 
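The relationship between the letter alphabet and the number alphabet can be checked with a short sketch. It assumes that the ten letters of HIS PAYMENT are numbered 1 to 9 and 0 in order, and it takes the pair values as given by the Tribune experts in subparagraph c; the names `CELLS` and `decipher_numbers` are illustrative.

```python
KEY = "HISPAYMENT"      # key phrase heading the rows and columns
DIGITS = "1234567890"   # assumed numbering of the key letters

# Letter pairs and their plain-text values from the Tribune alphabets.
CELLS = {
    "IA": "K", "IM": "S", "IT": "D", "SH": "L", "SS": "N", "SP": "W",
    "SN": "P", "PI": "R", "PP": "H", "PE": "T", "AI": "U", "AA": "O",
    "YI": "X", "YY": "A", "YE": "F", "MA": "B", "MM": "G", "EI": "I",
    "EP": "C", "EM": "V", "EN": "Y", "NS": "E", "NY": "M", "NN": "J",
}


def pair_to_number(pair: str) -> str:
    """Replace each key letter by its digit, e.g. PP -> 44."""
    return "".join(DIGITS[KEY.index(ch)] for ch in pair)


NUMBER_CELLS = {pair_to_number(pair): value for pair, value in CELLS.items()}


def decipher_numbers(text: str) -> str:
    return "".join(NUMBER_CELLS[group] for group in text.split())


MESSAGE = """
84 55 84 25 93 34 82 31 31 75 93 82 77 33 55 42
93 20 93 66 77 66 33 84 66 31 31 93 20 82 33 66
52 48 44 55 42 82 48 89 42 93 31 82 66 75 31 93
"""
print(decipher_numbers(MESSAGE))
```

That one table serves both alphabets is what shows the two systems to rest on a single rectangle.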
Section VIII 

MULTILITERAL SUBSTITUTION WITH MULTIPLE-EQUIVALENT CIPHER ALPHABETS 

                                                                              Paragraph 

Purpose of providing multiple-equivalent cipher alphabets                        37 
Solution of a simple example                                                     38 
Solution of more complicated example                                             39 
A subterfuge to prevent decomposition of cipher text into component units        40 



37. Purpose of providing multiple-equivalent cipher alphabets. a. It has been seen that 
the characteristic frequencies of letters composing normal plain text, the associations they form 
in combining to form words, and the peculiarities certain of them manifest in such text all afford 
direct clues by means of which ordinary monoalphabetic substitution encipherments of such 
plain text may be more or less speedily solved. This has led to the introduction of simple 
methods for disguising or suppressing the manifestations of monoalphabeticity, so far as possible. 
Basically these methods are multiliteral and they will now be presented. 

b. Multiliteral substitution may be of two types: (1) That wherein each letter of the plain 
text is represented by one and only one multiliteral equivalent. For example, in the Francis 
Bacon cipher described in Par. 35e, the letter K is invariably represented by the permutation 
abaab. For this reason this type of system may be more completely described as monoalphabetic, 
multiliteral substitution with single-equivalent cipher alphabets. 

(2) That wherein, because of the large number of equivalents made available by the 
combinations and permutations of a limited number of elements, each letter of the plain text may be 
represented by several multiliteral equivalents which may be selected at random. For example, 
if 3-letter combinations are employed there are available 26³, or 17,576, equivalents for the 26 
letters of the plain text; these may be assigned in equal numbers of different equivalents for the 
26 letters, in which case each letter would be representable by 676 different 3-letter equivalents; 
or they may be assigned on some other basis, for example, proportionately to the relative 
frequencies of plain-text letters. For this reason this type of system may be more completely 
described as monoalphabetic, multiliteral substitution with multiple-equivalent cipher alphabets. 
Some authors term such a system "simple substitution with multiple equivalents"; others term 
it monoalphabetic substitution with variants. For the sake of brevity, the latter designation will 
be employed in this text. 

c. The primary object of monoalphabetic substitution with variants is, as has been 
mentioned above, to provide several values which may be employed at random in a simple substitution 
of cipher equivalents for the plain-text letters. In this connection, reference is made to Section 
X of Elementary Military Cryptography, wherein several of the most common methods for 
producing and using variants are set forth. 

d. A word or two concerning the underlying theory from the cryptanalytic point of view of 
monoalphabetic substitution with variants may not be amiss. Whereas in simple or 
single-equivalent, monoalphabetic substitution it is seen that 

(1) The same letter of the plain text is invariably represented by but one and always the 
same character of the cryptogram, and 

(2) The same character of the cryptogram invariably represents one and always the same 
letter of the plain text; 

In multiliteral substitution with multiple equivalents (monoalphabetic substitution with 
variants) it is seen that 

(1) The same letter of the plain text may be represented by one or more different characters 
of the cryptogram, but 

(2) The same character of the cryptogram nevertheless invariably represents one and always 
the same letter of the plain text. 

38. Solution of a simple example. a. The following cryptogram has been enciphered by a 
set of four alphabets similar to the following: 

A   B   C   D   E   F   G   H   I-J K   L   M   N   O   P   Q   R   S   T   U   V   W   X   Y   Z 
08  09  10  11  12  13  14  15  16  17  18  19  20  21  22  23  24  25  01  02  03  04  05  06  07 
35  36  37  38  39  40  41  42  43  44  45  46  47  48  49  50  26  27  28  29  30  31  32  33  34 
68  69  70  71  72  73  74  75  51  52  53  54  55  56  57  58  59  60  61  62  63  64  65  66  67 
87  88  89  90  91  92  93  94  95  96  97  98  99  00  76  77  78  79  80  81  82  83  84  85  86 

The keyword here is TRIP.¹ In enciphering a message the equivalents are to be selected at 
random from among the four variants for each letter. The steps in solving a message produced 
by such a scheme will now be scrutinized. 
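The construction of the four alphabets from the keyword, and the random choice among the variants, can be sketched as follows. The function names and the use of Python's `random` module are assumptions made for illustration.

```python
import random

ALPHABET_25 = "ABCDEFGHIKLMNOPQRSTUVWXYZ"  # J is merged with I


def variant_table(keyword: str) -> dict:
    """Line k holds the numbers 25k+1 .. 25k+25 (100 is written 00);
    the lowest number of each line falls on that line's key letter."""
    table = {letter: [] for letter in ALPHABET_25}
    for line, key_letter in enumerate(keyword.upper().replace("J", "I")):
        start = ALPHABET_25.index(key_letter)
        for n in range(25):
            number = (line * 25 + n + 1) % 100
            table[ALPHABET_25[(start + n) % 25]].append(f"{number:02d}")
    return table


def encipher(plain: str, table: dict, rng: random.Random) -> str:
    """Pick one of the variants at random for every plain-text letter."""
    letters = plain.upper().replace("J", "I")
    return " ".join(rng.choice(table[ch]) for ch in letters if ch in table)


def decipher(cipher: str, table: dict) -> str:
    inverse = {num: letter for letter, nums in table.items() for num in nums}
    return "".join(inverse[group] for group in cipher.split())


table = variant_table("TRIP")
print(table["A"])
cipher = encipher("EASTERN ENTRANCE", table, random.Random(7))
print(cipher)
print(decipher(cipher, table))
```

Each number still stands for one letter only, so decipherment is unambiguous however the variants were chosen.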

CRYPTOGRAM 

68321 09022 48057 65111 88648 42036 45235 09144 05764 22684 
00225 57003 97357 14074 82524 40768 51058 93074 92188 47264 
09328 04255 06186 79882 85144 45886 32574 55136 56019 45722 
76844 68350 45219 71649 90528 65106 11886 44044 89669 70553 
18491 06985 48579 33684 50957 70612 09795 29148 56109 08546 
62062 65509 32800 32568 97216 44282 34031 84989 68564 53789 
12530 77401 68494 38544 11368 87616 56905 20710 58864 67472 
22490 09136 62851 24551 35180 14230 50886 44084 06231 12876 
05579 58980 29503 99713 32720 36433 82689 04516 52263 21175 
06445 72255 68951 86957 76095 67215 53049 08567 9730 

b. Assuming that the foregoing remarks had not been made and that the cryptogram has 
just been submitted for solution with no information concerning it, the first step is to make a 
preliminary study to determine whether the cryptogram involves cipher or code. The 
cryptogram appears in 5-figure groups, which may indicate either cipher or code. A few remarks will 
be made at this point with reference to the method of determining whether a cryptogram 
composed of figure groups is in code or cipher, using the foregoing example. 

c. In the first place, if the cryptogram contains an even number of digits, as for example 
494 in the foregoing message, this leaves open the possibility that it may be cipher, composed of 
247 pairs of digits; were the number of digits an exact odd multiple of five, such as 125, 135, etc., 
the possibility that the cryptogram is in code of the 5-figure group type must be considered. Next, 
a preliminary study is made to see if there are many repetitions, and what their characteristics 
are. If the cryptogram is code of the 5-figure group type, then such repetitions as appear should 
generally be in whole groups of five digits, and they should be visible in the text just as the 
message stands, unless the code message has undergone encipherment also. If the cryptogram is in 
cipher, then the repetitions should extend beyond the 5-digit groupings; if they conform to any 
definite groupings at all they should for the most part contain even numbers of digits since each 
letter is probably represented by a pair of digits. If no clues of the foregoing nature are present, 
doubts will be dissolved by making a detailed study of frequencies. 

¹ The letter corresponding to the lowest number in each line of the diagram showing the cipher alphabets 
is a key letter. Thus, in the 1st line 01=T; in the 2d line 26=R; etc. 

d. A simple 4-part frequency distribution is therefore decided upon. Shall the alphabet be 
assumed to be a 25- or a 26-character one? If the former, then the 2-digit pairs from 01 to 00 
fall into exactly four groups each corresponding to an alphabet. Since this is the most common 
scheme of drawing up such alphabets, let it be assumed to be true of the present case. The 
following distributions result from the breaking up of the text into 2-digit pairs. 



01  ///           26  ///           51  /////         76  ///// / 
02                27                52  /////         77  / 
03  ////          28  /             53  ///           78 
04  /             29  /             54                79  / 
05  /////         30  ///           55  ////          80  /// 
06  ///// /       31                56  /////         81 
07  ///           32  ///// /       57  ///// /       82  //// 
08                33  /             58  //            83  / 
09  ////          34  /             59                84  ///// / 
10  ////          35  //            60                85  ///// / 
11  /////         36  /////         61                86  /// 
12  ///           37  /             62  //            87 
13  /             38                63                88  //// 
14  /             39  /             64  ///// /       89  ///// 
15  /             40  ///           65                90  ///// / 
16  ///           41                66  /             91  /// 
17                42  ////          67  //            92  / 
18  ///// /       43  /             68  ///// //      93  / 
19                44  ///// /       69  //            94  / 
20  /             45  ///// /       70  /             95  /// 
21  //            46  ///           71  /             96 
22  /////         47                72  ////          97  ///// / 
23  //            48  ///           73                98  / 
24                49  /////         74  ////          99 
25  /             50  /////         75  /             00  // 
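The 4-part distribution can be compiled mechanically. The following sketch assumes the 25-character scheme adopted in subparagraph d (01-25, 26-50, 51-75, 76-00); the variable and function names are illustrative.

```python
from collections import Counter

CRYPTOGRAM = """
68321 09022 48057 65111 88648 42036 45235 09144 05764 22684
00225 57003 97357 14074 82524 40768 51058 93074 92188 47264
09328 04255 06186 79882 85144 45886 32574 55136 56019 45722
76844 68350 45219 71649 90528 65106 11886 44044 89669 70553
18491 06985 48579 33684 50957 70612 09795 29148 56109 08546
62062 65509 32800 32568 97216 44282 34031 84989 68564 53789
12530 77401 68494 38544 11368 87616 56905 20710 58864 67472
22490 09136 62851 24551 35180 14230 50886 44084 06231 12876
05579 58980 29503 99713 32720 36433 82689 04516 52263 21175
06445 72255 68951 86957 76095 67215 53049 08567 9730
"""

digits = "".join(CRYPTOGRAM.split())                 # drop the 5-figure grouping
pairs = [digits[i:i + 2] for i in range(0, len(digits), 2)]
counts = Counter(pairs)


def group_of(pair: str) -> int:
    """0 for 01-25, 1 for 26-50, 2 for 51-75, 3 for 76-00 (00 counts as 100)."""
    number = int(pair) or 100
    return (number - 1) // 25


group_totals = Counter(group_of(p) for p in pairs)

print(len(digits), "digits,", len(pairs), "pairs")
for first in range(1, 26):
    row = [f"{(first + 25 * g) % 100:02d}" for g in range(4)]
    print("   ".join(f"{p} {'/' * counts[p]:<8}" for p in row))
```

An even digit count is only a necessary condition for a 2-digit cipher; the shape of the four distributions is what settles the matter.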



e. If the student will bring to bear upon this problem the principles he learned in Section V 
of this text, he will soon realize that what he now has before him are four simple, 
monoalphabetic frequency distributions similar to those involved in a monoalphabetic substitution cipher 
using standard cipher alphabets. The realization of this fact immediately provides the clue to 
the next step: "fitting each of the distributions to the normal." (See Par. 17b.) This can be 
done without difficulty in this case (remembering that a 25-letter alphabet is involved and 
assuming that I and J are the same letter) and the following alphabets result: 



01  I-J        26  U          51  N          76  E 
02  K          27  V          52  O          77  F 
03  L          28  W          53  P          78  G 
04  M          29  X          54  Q          79  H 
05  N          30  Y          55  R          80  I-J 
06  O          31  Z          56  S          81  K 
07  P          32  A          57  T          82  L 
08  Q          33  B          58  U          83  M 
09  R          34  C          59  V          84  N 
10  S          35  D          60  W          85  O 
11  T          36  E          61  X          86  P 
12  U          37  F          62  Y          87  Q 
13  V          38  G          63  Z          88  R 
14  W          39  H          64  A          89  S 
15  X          40  I-J        65  B          90  T 
16  Y          41  K          66  C          91  U 
17  Z          42  L          67  D          92  V 
18  A          43  M          68  E          93  W 
19  B          44  N          69  F          94  X 
20  C          45  O          70  G          95  Y 
21  D          46  P          71  H          96  Z 
22  E          47  Q          72  I-J        97  A 
23  F          48  R          73  K          98  B 
24  G          49  S          74  L          99  C 
25  H          50  T          75  M          00  D 


f. The keyword is seen to be JUNE and the first few groups of the cryptogram decipher as 
follows: 

68 32 10 90 22 48 05 76 51 11 88 64 84 20 36 45 23 
E  A  S  T  E  R  N  E  N  T  R  A  N  C  E  O  F 
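"Fitting to the normal" can be imitated by scoring every possible starting letter of a line against a table of normal frequencies. This is a rough sketch, not the manual procedure of the text: the frequency values are approximate figures per 1,000 letters assumed for illustration, the scoring rule (sum of tally times normal frequency) is one simple choice among several, and the best-scoring letter is only a candidate to be confirmed by trial decipherment.

```python
ALPHABET_25 = "ABCDEFGHIKLMNOPQRSTUVWXYZ"  # J merged with I

# Approximate normal frequencies per 1,000 letters (assumed values).
NORMAL = dict(zip(ALPHABET_25, [
    74, 10, 31, 42, 130, 28, 16, 34, 76, 3, 36, 25, 79,
    75, 27, 3, 76, 61, 92, 26, 15, 16, 5, 19, 1]))


def fit_key_letter(tallies: list) -> str:
    """tallies[n] is the count of the (n+1)th number of a line.
    Return the starting letter whose alignment scores highest."""
    def score(start: int) -> int:
        return sum(tallies[n] * NORMAL[ALPHABET_25[(start + n) % 25]]
                   for n in range(25))
    return ALPHABET_25[max(range(25), key=score)]


def decipher(pairs: list, keyword: str) -> str:
    """Decipher 2-digit pairs, line k of the table starting at keyword[k]."""
    keyword = keyword.upper().replace("J", "I")
    out = []
    for pair in pairs:
        line, position = divmod((int(pair) or 100) - 1, 25)
        start = ALPHABET_25.index(keyword[line])
        out.append(ALPHABET_25[(start + position) % 25])
    return "".join(out)


# Tallies for the numbers 01-25, taken from the distribution in subparagraph d.
first_line = [3, 0, 4, 1, 5, 6, 3, 0, 4, 4, 5, 3, 1,
              1, 1, 3, 0, 6, 0, 1, 2, 5, 2, 0, 1]
print("candidate key letter for 01-25:", fit_key_letter(first_line))

opening = "68 32 10 90 22 48 05 76 51 11 88 64 84 20 36 45 23".split()
print(decipher(opening, "JUNE"))
```

Because J is merged with I, a key letter reported as I may equally stand for J, as in the keyword JUNE.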

g. From the detailed procedure given above, the student should be able to draw his own 
conclusions as to the procedure to be followed in solving cryptograms produced by methods 
which are more or less simple variations of that just discussed. In this connection he is referred 
to Section X of Elementary Military Cryptography, wherein a few of these variations are mentioned. 

h. Possibly the most important of the variations is that in which a rectangle such as that 
shown in Fig. 17 is employed. 





              1   2   3   4   5   6   7   8   9   0 
  1, 4, 7     A   B   C   D   E   F   G   H   I   J 
  2, 5, 8     K   L   M   N   O   P   Q   R   S   T 
  3, 6, 9     U   V   W   X   Y   Z 

                     FIGURE 17 
In the solution of cases of this kind, repetitions would play their usual role, with the modifications 
noted below in Par. 39. Once an entering wedge has been forced, through the identification 
of one or more repeated words such as BATTALION, DIVISION, etc., the entire enciphering 
rectangle would soon be reconstructed. It may be added that the frequency distribution for 
the text of a single long message or several short ones enciphered by such a system would show 
characteristic phenomena, the most important of which are, first, that the distribution for a 
rectangle such as shown in Fig. 17 would practically follow the normal and, second, that the 
distribution for the 2d digit of pairs would show more marked crests and troughs than the 
distribution for the 1st digit. For example, the initial digits 1, 4, and 7 (for the numbers 10-19, 
40-49, and 70-79, inclusive) would apply to the distribution for the letters A to J, inclusive; the 
initial digits 2, 5, and 8 would apply to the distribution for the letters K to T, inclusive. The 
total weighted frequency values for these two groups of letters are about equal. Therefore, 
the frequencies of the initial digits 1, 2, 4, 5, 7, and 8 would be approximately equal. But 
consider the final digit 5 in the numbers 15, 45, 75, 25, 55, and 85; its total frequency is 
composed of the frequency of Ep plus the frequency of Op; whereas in the case of the final digit 6, 
its total frequency is composed of the frequency of Fp plus the frequency of Pp. The two cases 
would show a marked difference in frequency. Of course, the letters may be inserted within 
the enciphering rectangle in a keyword-mixed or even in a random order; the numbers may be 
applied to the rectangle in a random order. But these variations, while increasing the difficulty 
in solution, by no means make the latter as great as may be thought by the novice. 

39. Solution of a more complicated example. a. As soon as a beginner in cryptography 
realizes the consequences of the fact that letters are used with greatly varying frequencies 
in normal plain text, a brilliant idea very speedily comes to him. Why not disguise the 
natural frequencies of letters by a system of substitution using many equivalents, and let 
the numbers of equivalents assigned to the various letters be more or less in direct proportion 
to the normal frequencies of the letters? Let E, for example, have 13 or more equivalents; T, 10; 
N, 9; etc., and thus (he thinks) the enemy cryptanalyst can have nothing in the way of tell-tale 
or characteristic frequencies to use as an entering wedge. 

b. If the text available for study is small in amount and if the variant values are wholly 
independent of one another, the problem can become exceedingly difficult. But in practical 
military communications such methods are rarely encountered, because the volume of text is usually 
great enough to permit of the establishment of equivalent values. To illustrate what is meant, 
suppose a set of cryptograms produced by the monoalphabetic-variant method described above 
shows the following two sets of groupings in the text: 

Set A                              Set B 

12-37-02-79-68-13-03-37-77         71-12-02-51-23-05-77 
82-69-03-79-13-68-23-37-35         11-82-51-02-03-05-35 
82-69-51-16-13-13-78-05-35         11-91-02-02-23-37-35 
91-05-02-01-68-42-78-37-77         97-12-51^3-78-69-77 



An examination of these groupings would lead to the following tentative conclusions with regard 
to probable equivalents: 

12, 82, 91     01, 16, 79     03, 23, 78 
05, 37, 69     13, 42, 68     35, and 77 
02, and 51 

The establishment of these equivalencies would sooner or later lead to the finding of additional 
sets of equal values. The completeness with which this can be accomplished will determine 
the ease or difficulty of solution. Of course, if many equivalencies can be established the 
problem can then be reduced practically to monoalphabetic terms and a speedy solution can 
be attained. 
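Collecting such equivalencies is a matter of merging values that stand in the same position of lines believed to represent the same plain text. A minimal sketch using a union-find structure, which is a modern device and not the text's own procedure. Two loud assumptions: the second line of Set A and the fourth line of Set B are left out because their printed values are doubtful in this copy, and the first value of each Set B line is left out because the text draws no conclusion from it.

```python
def equivalence_classes(groups: list) -> list:
    """groups is a list of sets of aligned lines; values in the same column
    of one set are merged. Returns the classes with more than one member."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for lines in groups:
        rows = [line.split("-") for line in lines]
        for column in zip(*rows):
            for value in column[1:]:
                union(column[0], value)

    classes = {}
    for value in parent:
        classes.setdefault(find(value), set()).add(value)
    return sorted(sorted(c) for c in classes.values() if len(c) > 1)


SET_A = ["12-37-02-79-68-13-03-37-77",      # lines 1, 3, 4 of Set A
         "82-69-51-16-13-13-78-05-35",
         "91-05-02-01-68-42-78-37-77"]
SET_B = ["12-02-51-23-05-77",               # lines 1-3 of Set B, first value dropped
         "82-51-02-03-05-35",
         "91-02-02-23-37-35"]

for members in equivalence_classes([SET_A, SET_B]):
    print(", ".join(members))
```

Every merged class is tentative; as the next subparagraph warns, two different words can imitate a variant relationship.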

c. Theoretically, the determination of equivalencies may seem to be quite an easy matter, 
but practically it may be very difficult, because the cryptanalyst can never be certain that a 
combination showing what may appear to be a variant value is really such, and is not a different 
word. For example, take the groups: 

17-82-31-82-14-63, and 
27-82-40-82-14-63 

Here one might suspect that 17 and 27 represent the same letter, 31 and 40 another letter. But 
it happens that one group represents the word MANAGE, the other DAMAGE. 

d. When reversible combinations are used as variants, the problem is perhaps a bit more 
simple. For example, using the accompanying Fig. 18 for encipherment, two messages with 
the same initial words, REFERENCE YOUR, may be enciphered as follows: 





          K,Z   Q,V   B,H   M,R   D,L 
  W,S      N     H     A     O     E 
  F,X      D     T     M     F     P 
  G,J      Q     B     U     I     V 
  C,N      G     X     R     C     S 
  P,T      Z     L     Y     W     K 

               FIGURE 18. 



RE F ER E N C EY OU R 

(1) N H W D R X L S H C D W W ^ N RSLHP skBjd 5 

(2) CHDWR XSLHN DVZWN RLSHP RWJBN H 

The experienced cryptanalyst, noting the appearance of the very first few groups, assumes that 
he is here confronted with a case involving biliteral reversible equivalents, with variants. 

e. The probable-word method of solution may be used, but with a slight variation 
introduced by virtue of the fact that, regardless of the system, letters of low frequency in plain text 
remain infrequent. Hence, suppose a word containing low-frequency letters, but in itself a 
rather common word strikingly idiomorphic in character, is sought as a "probable word"; for 
example, words such as CAVALRY, ATTACK, and PREPARE. Writing such a word on a slip of 
paper, it is slid one interval at a time under the text, which has been marked so that the high 
and low-frequency characters are indicated. Each coincidence of a low-frequency letter of the 
text with a low-frequency letter of the assumed word is examined carefully to see whether the 
adjacent text letters correspond in frequency with the other letters of the assumed word; or, if 
the latter presents repetitions, whether there are correspondences between repetitions in the 
text and those in the word. Many trials are necessary but this method will produce results 
when the difficulties are otherwise too much for the cryptanalyst to overcome. 

40. A subterfuge to prevent decomposition of cipher text into component units. a. A few 
words should be added with regard to certain subterfuges which are sometimes encountered in 
monoalphabetic substitution with variants, and which, if not recognized in time, cause 
considerable delays. These have to deal with the insertion of nulls so as to prevent the cryptanalyst 
from breaking up the text into its real cryptographic units. The student should take careful 
note of the last phrase; the mere insertion of symbols having the same characteristics as the 
symbols of the cryptographic text, except that they have no meaning, is not what is meant. 
This class of nulls rarely achieves the purpose for which they are intended. What is really meant 
can best be explained in connection with an example. Suppose that a 5 x 5 checkerboard design 
with the row and column indicators shown in Fig. 19 is adopted for encipherment. Normally, 
the cipher units would consist of 2-letter combinations of the indicators, invariably giving the 
row indicator first (by agreement). 

                  V    G    I    W    D
                  A    H    P    S    M
                  T    O    E    B    N
                  F    U    R    L    C

    V,A,T,F       A    B    C    D    E
    G,H,O,U       F    G    H    I-J  K
    I,P,E,R       L    M    N    O    P
    W,S,B,L       Q    R    S    T    U
    D,M,N,C       V    W    X    Y    Z

                      Figure 19.

The phrase COMMANDER OF SPECIAL TROOPS might be enciphered thus: 

C  O  M  M  A  N  D  E  R  O  F  ... 
VI EB PH IU FT IE AB TM WO PW GT ... 

These would normally then be arranged in 5-letter groups, thus: 

VIEBP HIUFT IEABT MWOPW GT... 

b. It will be noted, however, that only 20 of the 26 letters of the alphabet have been employed 
as row and column indicators, leaving J, K, Q, X, Y, and Z unused. Now, suppose these six letters 
are used as nulls, not in pairs, but as individual letters inserted at random just before the real text is 
arranged in 5-letter groups. Occasionally, a pair of nulls is inserted. Thus, for example: 

VIEXB PHKIU FJXTI EAJBT MWOQP WGKTY 

The cryptanalyst, after some study, suspecting a biliteral cipher, proceeds to break up the text 
into pairs: 

VI EX BP HK IU FJ XT IE AJ BT MW OQ PW GK TY 

Compare this set of 2-letter combinations with the correct set. Only 4 of the 15 pairs are "proper" 
units. It is easy to see that without a knowledge of the existence of the nulls, and even with such 
knowledge, if he does not know which letters are nulls, the cryptanalyst would be confronted with 
a problem for the solution of which a fairly large amount of text might be necessary. The 
careful employment of the variants also very materially adds to the security of the method 
because repetitions can be rather effectively suppressed. 
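
The effect of the nulls upon the division into pairs can be illustrated mechanically. The following Python fragment is only a sketch under stated assumptions: the indicator sets and the square are those read from Fig. 19, while the function names and the null rate are hypothetical and serve only the illustration.

```python
import random

# Indicator variants as read from Fig. 19; the same five sets label rows and columns.
INDICATORS = ["VATF", "GHOU", "IPER", "WSBL", "DMNC"]
SQUARE = ["ABCDE", "FGHIK", "LMNOP", "QRSTU", "VWXYZ"]  # I and J share a cell
NULLS = "JKQXYZ"  # the letters not employed as indicators


def encipher(plain, rng):
    """Replace each letter by a row indicator followed by a column indicator."""
    out = []
    for ch in plain.upper().replace("J", "I"):
        for r, row in enumerate(SQUARE):
            c = row.find(ch)
            if c >= 0:  # spaces and punctuation are simply skipped
                out.append(rng.choice(INDICATORS[r]) + rng.choice(INDICATORS[c]))
    return "".join(out)


def insert_nulls(cipher, rng, rate=0.3):
    """Scatter single nulls through the text (rate is a hypothetical parameter)."""
    out = []
    for ch in cipher:
        if rng.random() < rate:
            out.append(rng.choice(NULLS))
        out.append(ch)
    return "".join(out)


def strip_nulls(text):
    return "".join(ch for ch in text if ch not in NULLS)


def decipher(cipher):
    """Read null-free text in pairs: row indicator first, then column indicator."""
    group = {ch: i for i, letters in enumerate(INDICATORS) for ch in letters}
    return "".join(SQUARE[group[a]][group[b]] for a, b in zip(cipher[::2], cipher[1::2]))


def proper_pairs(text):
    """Count the naive 2-letter groups that coincide with real cipher units."""
    count = 0
    real = 0  # non-null letters seen before position i
    for i in range(len(text)):
        if text[i] in NULLS:
            continue
        starts_unit = real % 2 == 0
        real += 1
        if i % 2 == 0 and starts_unit and i + 1 < len(text) and text[i + 1] not in NULLS:
            count += 1
    return count


example = "VIEXBPHKIUFJXTIEAJBTMWOQPWGKTY"
print(proper_pairs(example), decipher(strip_nulls(example)))
```

The correspondent who knows the nulls merely strikes them out before deciphering; the cryptanalyst who divides the text blindly obtains only the few proper units counted by proper_pairs.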

c. From the cryptographic standpoint, the fact that in this system the cryptographic text 
is more than twice as long as the plain text constitutes a serious disadvantage. From the 
cryptanalytic standpoint, the masking of the cipher units constitutes the most important source 
of strength of the system; this, coupled with the use of variants, makes it a somewhat more difficult 
system to solve, despite its monoalphabeticity. 

41. Monographic and polygraphic substitution systems. — a. The student is now referred 
to Sections VII and VIII of Advanced Military Cryptography, wherein polygraphic systems of 
substitution are discussed from the cryptographic point of view. These will now be discussed 
from the cryptanalytic point of view. 

b. Although the essential differences between polyliteral and polygraphic substitution are 
treated with some detail in Section VII of Advanced Military Cryptography, a few additional 
words on the subject may not be amiss at this point. 

c. The two primary divisions of substitution systems into (1) uniliteral and multiliteral 
methods and into (2) monographic and polygraphic methods are both based upon considerations 
as to the number of elements constituting the plain-text and the equivalent cipher-text units. In 
uniliteral as well as in monographic substitution, each plain-text unit consists of a single element 
and each cipher-text unit consists of a single element. The two terms uniliteral and monographic 
are therefore identical in significance, as defined cryptographically. It is when the 
terms multiliteral and polygraphic are examined that an essential difference is seen. In 
multiliteral substitution the plain-text unit always consists of a single element (one letter) and the 
cipher-text unit consists of a group of two or more elements; when biliteral, it is a pair of elements, 
when triliteral, it is a set of three elements, and so on. In what will herein be designated as 
true or complete polygraphic substitution the plain-text unit consists of two or more elements 
forming an indivisible compound; the cipher-text unit usually consists of a corresponding number 
of elements.* When the number of elements comprising the plain-text units is fixed and always 
two, the system is digraphic; when it is three, the system is trigraphic; when it is four, 
tetragraphic; and so on.** It is important to note that in true or complete polygraphic substitution 
the elements combine to form indivisible compounds having properties different from those of 
either of the constituent letters. For example, in uniliteral substitution ABp may yield XYc, and 
ACp may yield XZc; but in true digraphic substitution ABp may yield XYc and ACp may yield QNc. 
A difference in identity of one letter affects the whole result.*** An analogy is found in chemistry, 
where two elements combine to form a molecule, the latter usually having properties quite 
different from those of either of the constituent elements. For example: sodium, a metal, and 
chlorine, a gas, combine to form sodium chloride, common table salt. Furthermore, sodium and 
fluorine, also a gas similar in many respects to chlorine, combine to form sodium fluoride, which 
is much different from table salt. Partial and pseudo-polygraphic substitution will be treated 
under subparagraphs d and e below. 

* The qualifying adverb "usually" is employed because this correspondence is not essential. For example, 
if one should draw up a set of 676 arbitrary single signs, it would be possible to represent the 2-letter pairs from 
AA to ZZ by single symbols. This would still be a digraphic system. 

** In this sense a code system is merely a polygraphic substitution system in which the number of elements 
constituting the plain-text units is variable. 

*** For this reason the two letters are marked by a ligature, that is, by a bar across their tops. 

d. Another way of looking at polygraphic substitution is to regard the elements comprising 
the plain-text units as being enciphered individually and polyalphabetically by a fairly large 
number of separate alphabets. For example, in a digraphic system in which 676 pairs of 
plain-text letters are representable by 676 cipher-text pairs assigned at random, this is equivalent to 
having a set of 26 different alphabets for enciphering one member of the pairs, and another set 
of 26 different alphabets for enciphering the other member of the pairs. According to this 
viewpoint the different alphabets are brought into play by the particular combination of letters 
forming each plain-text pair. This is, of course, quite different from systems wherein the various 
alphabets are brought into play by more definite rules; it is perhaps this very absence of definite 
rules guiding the selection of alphabets which constitutes the cryptographic strength of this type 
of polygraphic system. 
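
This viewpoint can be made concrete with a small experiment. The Python fragment below is an illustrative sketch only: the seed, the table, and the function names are assumptions made for the demonstration and do not reproduce any table from the texts cited. It builds a random 676-pair table and extracts, for each possible second plain-text letter, the "alphabet" by which the first letter is enciphered. In a table assigned purely at random such an alphabet need not be a one-to-one substitution.

```python
import random
import string

LETTERS = string.ascii_uppercase
rng = random.Random(1938)  # arbitrary seed, chosen only to make the run repeatable

# Uniliteral substitution: one mixed alphabet applied letter by letter.
mixed = list(LETTERS)
rng.shuffle(mixed)
UNILITERAL = dict(zip(LETTERS, mixed))

# Digraphic substitution: the 676 plain pairs matched at random with the 676 cipher pairs.
pairs = [a + b for a in LETTERS for b in LETTERS]
shuffled = pairs[:]
rng.shuffle(shuffled)
DIGRAPHIC = dict(zip(pairs, shuffled))


def uniliteral(pair):
    return UNILITERAL[pair[0]] + UNILITERAL[pair[1]]


def first_letter_alphabets(table):
    """For each second plain letter, the sequence enciphering the first letter A to Z."""
    return {b: "".join(table[a + b][0] for a in LETTERS) for b in LETTERS}


print(uniliteral("AB"), uniliteral("AC"))      # the first cipher letter is the same
print(DIGRAPHIC["AB"], DIGRAPHIC["AC"])        # the whole equivalent generally changes
print(len(first_letter_alphabets(DIGRAPHIC)))  # 26 alphabets, one per second letter
```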

e. When regarded in the light of the preceding remarks, certain systems which at first glance 
seem to be polygraphic, in that groupings of plain-text letters are treated as units, on closer 
inspection are seen to be only partially polygraphic, or pseudo-polygraphic in character. For 
example, in a system in which encipherment is by pairs and yet one of the letters in each pair is 
enciphered monoalphabetically, the other letter, polyalphabetically, the method is only 
pseudo-polygraphic. Cases of this type are shown in Section VII of Advanced Military Cryptography. 
Again, in a system in which encipherment is by pairs and the encipherments of the left-hand 
and right-hand members of the pairs show group relationships, this is not pseudo-polygraphic 
but only partially polygraphic. Cases of this type are also shown in the text referred to above. 

f. The fundamental purpose of polygraphic substitution is again the suppression of the 
frequency characteristics of plain text, just as is the case in monoalphabetic substitution with 
variants; but here this is accomplished by a different method, the latter arising from a somewhat 
different approach to the problem involved in producing cryptographic security. When the 
substitution involves replacement of single letters in a monoalphabetic system, the cryptogram can 
be solved rather readily. Basically the reason for this is that the principles of frequency and the 
laws of probability, applied to individual units of the text (single letters), have a very good 
opportunity to manifest themselves. A given volume of text of, say, n plain-text letters, enciphered 
purely monoalphabetically, affords n cipher characters, and the same number of cipher units. 
The same volume of text, enciphered digraphically, still affords n cipher characters but only 
n/2 cipher units. Statistically speaking, the sample within which the laws of probability now apply 
has been cut in half. Furthermore, from the point of view of frequency, the very noticeable 
diversity in the frequencies of individual letters, leading to the marked crests and troughs of 
the uniliteral frequency distribution, is no longer so strikingly in evidence in the frequencies of 
digraphs. Therefore, although true digraphic encipherment, for example, cuts the cryptographic 
textual units in half, the difficulty of solution is not doubled, but, if a matter of judgment arising 
from practical experience can be expressed or approximated mathematically, squared or cubed. 
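
The arithmetic behind this remark may be sketched as follows. The Python fragment is illustrative only, and the function names are hypothetical; it counts the units a text affords when taken singly and in pairs, and compares that number with the number of different units possible (26 letters against 676 digraphs).

```python
from collections import Counter


def units(text, size):
    """Split the text into consecutive, non-overlapping units of the given size."""
    return [text[i:i + size] for i in range(0, len(text) - size + 1, size)]


def summary(text, size):
    """Return (number of units, number of possible units, average tally per cell)."""
    tally = Counter(units(text, size))
    cells = 26 ** size
    return len(units(text, size)), cells, sum(tally.values()) / cells


sample = "A" * 344  # only the length matters for this comparison
print(summary(sample, 1))
print(summary(sample, 2))
```

For a message of 344 letters the uniliteral distribution spreads 344 tallies over 26 cells, about 13 per cell, whereas the digraphic distribution spreads 172 tallies over 676 cells, about one tally for every four cells. The sample is halved while the number of categories is squared.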

g. Sections VII and VIII of Advanced Military Cryptography show various methods for the 
derivation of polygraphic equivalents and for handling these equivalents in cryptographing and 
decryptographing messages. The most practicable of those methods are digraphic in character 
and for this reason their solution will be treated in a somewhat more detailed manner than will 
trigraphic methods. The latter can be passed over with the simple statement that their analysis 
requires much text to permit of solution by the frequency method, and hard labor. Fortunately, 
they are infrequently encountered because they are difficult to manipulate without extensive 
tables.* If the latter are required they must be compiled in the form of a book or pamphlet. If 
one is willing to go that far, one might as well include in such document more or less extensive lists 
of words and phrases, in which case the system falls under the category of code and not cipher. 

42. Tests for identifying digraphic substitution. — a. The tests which are applied to 
determine whether a given cryptogram is digraphic in character are usually rather simple. If there 
are many repetitions in the cryptogram and yet the uniliteral-frequency distribution gives no 
clear-cut indications of monoalphabeticity; if most of the repetitions contain an even number 
of letters; and if the cryptogram contains an even number of letters, it may be assumed to be 
digraphic in nature. 

b. The student should first try to determine whether the substitution is completely digraphic, 
or only partially digraphic, or pseudo-digraphic in character. As mentioned above, there are 
cases in which, although the substitution is effected by taking pairs of letters, one of the members 
of the pairs is enciphered monoalphabetically, the other member, polyalphabetically. A 
distribution based upon the letters in the odd positions and one based upon those in the even 
positions should be made. If one of these is clearly monoalphabetic, then this is evidence that the 
message represents a case of pseudo-digraphism of the type here described. By attacking the 
monoalphabetic portion of the messages, solution can soon be reached by slight variation of the 
usual method, the polyalphabetic portion being solved by the aid of the context and 
considerations based upon the probable nature of the substitution chart. (See Tables 2, 3, and 4 of 
Advanced Military Cryptography.) It will be noted that the charts referred to show definite 
symmetry in their construction. 
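
The two distributions called for above are easily prepared by machine. The Python fragment below is a sketch under stated assumptions: the text does not name a numerical measure of monoalphabeticity at this point, so the index of coincidence is used here purely as an assumed stand-in (values near 0.066 are typical of English monoalphabetic text, values near 0.038 of random text), and the function names are hypothetical.

```python
from collections import Counter


def split_positions(cipher):
    """Return (letters in odd positions, letters in even positions), counting from 1."""
    return cipher[0::2], cipher[1::2]


def index_of_coincidence(text):
    """Chance that two letters drawn at random from the text are identical."""
    n = len(text)
    counts = Counter(text)
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))


odd, even = split_positions("XQVOZILKAPOLZXPVQNIKOLUKALHNLKVL")
print(Counter(odd).most_common(3), round(index_of_coincidence(odd), 3))
print(Counter(even).most_common(3), round(index_of_coincidence(even), 3))
```

With so short a sample the figures mean little; the fragment only shows the mechanics. In practice the two halves of a long cryptogram are compared, and a markedly higher figure for one half points to pseudo-digraphism of the type described.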

c. On the other hand, if the foregoing steps prove fruitless, it may be assumed that the 
cryptogram is completely digraphic in character. 

d. Just as certain statistical tests may be applied to a cryptogram to establish its 
monoalphabeticity, so also may a statistical test be applied to a cryptogram for the purpose of 
establishing its digraphicity. The nature of this test and its method of application will be discussed 
in a subsequent text. 

43. General procedure in the analysis of digraphic substitution ciphers. — a. The analysis of 
cryptograms which have been produced by digraphic substitution is accomplished largely by 
the application of the simple principles of frequency of digraphs, with the additional aid of such 
special circumstances as may be known to or suspected by the cryptanalyst. The latter refer 
to peculiarities which may be the result of the particular method employed in obtaining the 
equivalents of the plain-text digraphs in the cryptographing process. In general, however, 
only if there is sufficient text to disclose the normal phenomena of repetition will solution be 
feasible or possible. 

b. However, when a digraphic system is employed in regular service, there is little doubt 
but that traffic will rapidly accumulate to an amount more than sufficient to permit of solution 
by simple principles of frequency. Sometimes only two or three long messages, or a half dozen 
of average length, are sufficient. For with the identification of only a few cipher digraphs, 
larger portions of messages may be read because the skeletons of words formed from the few 
high-frequency digraphs very definitely limit the values that can be inserted for the intervening 
unidentified digraphs. For example, suppose that the plain-text digraphs TH, ER, IN, IS, OF, 
NT, and TO have been identified by frequency considerations, corroborated by a tentatively 
identified long repetition; and suppose also that the enemy is known to be using a quadricular 
table of 676 cells containing digraphs showing reciprocal equivalence between plain and 
cipher-text digraphs. Suppose the message begins as follows (in which the assumed values have been 
inserted): 

XQ VO ZI LK AP OL ZX PV QN IK OL UK AL HN LK VL 
FO .. TH IN .. NT .. RE .. .. NT NO .. .. IN .. 

BN OZ KU DY EL LE YW 
SI .. ON TO .. .. .. 

The words FOURTH INFANTRY REGIMENT are readily recognized. The reciprocal pairs ELc 
and LEc suggest ATTACK. The beginning of the message is now completely disclosed: FOURTH 
INFANTRY REGIMENT NOT YET IN POSITION TO ATTACK. The values more or less automatically 
determined are VOc=URp, ALc=TYp, HNc=ETp, VLc=POp, OZc=TIp, YWc=CKp. 

* A patent has been granted upon a rather ingenious machine for automatically accomplishing true 
polygraphic substitution, but it has not been placed upon the market. See U. S. Patent No. 1845947 issued in 1932 
to Weisner and Hill. In U. S. Patent No. 1615680 issued to Henkels in 1924, there is described a mechanism 
which also produces polygraphic substitution. 

c. Once a good start has been made and a few words have been solved, subsequent work 
is quite simple and straightforward. A knowledge of enemy correspondence, including data 
regarding its most common words and phrases, is of great assistance in breaking down new 
digraphic tables of the same nature but with different equivalents. 

d. The foregoing remarks also apply to the details of solution in cases of partially 
digraphic substitution. 

44. Analysis of digraphic substitution ciphers based upon 4-square checkerboard designs. — 
a. In Section VIII of Advanced Military Cryptography there are shown various examples of 
digraphic substitution based upon the use of checkerboard designs. These may be considered 
cases of partially digraphic substitution, in that in the checkerboard system there are certain 
relationships between plain-text digraphs having common elements and their corresponding 
cipher-text digraphs, which will also have common elements. For example, take the following 
4-square checkerboard design: 


    B   W   G   R   M       O   P   A   U   L
    N   Y   V   X   E       H   Z   Q   D   F
    S   I   C   T   K       K   I   T   S   C
    U   P   L   A   O       M   W   R   B   G
    D   Z   F   Q   H       E   Y   X   N   V

    W   A   L   E   S       C   X   K   P   B
    F   H   U   I   T       O   M   Y   D   V
    P   X   B   K   C       S   A   E   W   L
    N   Z   R   Q   G       G   Z   Q   N   R
    D   M   V   Y   O       T   H   I   F   U

                  Figure 20.



Here BCp=OWc, BOp=OFc, BSp=OPc, BGp=ONc, and BTp=ODc. In each case when Bp is the initial 
letter of the plain-text pair, the initial letter of the cipher-text equivalent is Oc. This, of course, 
is the direct result of the method; it means that the encipherment is monoalphabetic for the 
first half of each of these plain-text pairs, polyalphabetic for the second half. This 
relationship holds true for four other groups of pairs beginning with Bp. In other words, there are five 
alphabets employed, not 25. Thus, this case differs from the case discussed under Par. 42b 
only in that the monoalphabeticity is not complete for one-half of all the pairs, but only among 
the members of certain groups of pairs. In a completely digraphic system using a 676-cell 
randomized square, such relationships are entirely absent and for this reason the system is 
cryptographically more secure than the checkerboard system. 
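
The relationship can be verified mechanically. The Python fragment below is a minimal sketch, assuming that the four sections of Fig. 20 read as transcribed here, with the plain-text letters taken from the upper left and lower right sections and the cipher letters from the upper right and lower left sections; that assignment agrees with the five equivalents quoted above. The names S1 to S4, locate, and encipher_pair are hypothetical.

```python
S1 = ["BWGRM", "NYVXE", "SICTK", "UPLAO", "DZFQH"]  # plain, upper left
S2 = ["CXKPB", "OMYDV", "SAEWL", "GZQNR", "THIFU"]  # plain, lower right
S3 = ["OPAUL", "HZQDF", "KITSC", "MWRBG", "EYXNV"]  # cipher, upper right
S4 = ["WALES", "FHUIT", "PXBKC", "NZRQG", "DMVYO"]  # cipher, lower left


def locate(square, letter):
    """Return (row, column) of a letter in a 5 x 5 section, counting from 0."""
    for r, row in enumerate(square):
        c = row.find(letter)
        if c >= 0:
            return r, c
    raise ValueError(letter)


def encipher_pair(pair):
    r1, c1 = locate(S1, pair[0])  # first plain letter
    r2, c2 = locate(S2, pair[1])  # second plain letter
    # First cipher letter: row of the first plain letter, column of the second.
    # Second cipher letter: row of the second plain letter, column of the first.
    return S3[r1][c2] + S4[r2][c1]


for pair in ["BC", "BO", "BS", "BG", "BT"]:
    print(pair, encipher_pair(pair))

# Initial cipher letters for all 25 pairs beginning with B: only five occur.
second_letters = "".join(S2)
print(sorted({encipher_pair("B" + x)[0] for x in second_letters}))
```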

b. From the foregoing, it is clear that when solution has progressed sufficiently to disclose 
a few values, the insertion of letters within the cells of the checkerboard design to give the 
plain-text and cipher relationships indicated by the solved values immediately leads to the disclosure 
of additional values. Thus, the solution of only a few values soon leads to the breakdown of 
the entire checkerboard design. 

c. (1) The following example will serve to illustrate the procedure. Let the message be as 
follows: 

         1-5    6-10   11-15  16-20  21-25  26-30

    A.   HFCAP  GOQIL  BSPKM  NDUKE  OHQNF  BORUN
    B.   QCLCH  QBQBF  HMAFX  SIOKO  QYFNS  XMCGY
    C.   XIFBE  XAFDX  LPMXH  HRGKG  QKQML  FEQQI
    D.   GOIHM  UEORD  CLTUF  EQQCG  QNHFX  IFBEX
    E.   FLBUQ  FCHQO  QMAFT  XSYCB  EPFNB  SPKNU
    F.   QITXE  UQMLF  EQQIG  OIEUE  HPIAN  YTFLB
    G.   FEEPI  DHPCG  NQIHB  FHMHF  XCKUP  DGQPN
    H.   CBCQL  QPNFN  PNITO  RTENC  CBCNT  FHHAY
    I.   ZLQCI  AAIQU  CHTPC  BIFGW  KFCQS  LQMCB
    J.   OYCRQ  QDPRX  FNQML  FIDGC  CGIOG  OIHHF
    K.   IRCGG  GNDLN  OZTFG  EERRP  IFHOT  FHHAY
    L.   ZLQCI  AAIQU  CHTP



(2) The cipher having been tested for standard alphabets (by the method of completing 
the normal components) and found to give negative results, a uniliteral-frequency distribution 
is made. It is as follows: 

    A   B   C   D   E   F   G   H   I   J   K   L   M
    11  16  26  8   ..  ..  17  22  24  0   8   14  11

    N   O   P   Q   R   S   T   U   V   W   X   Y   Z
    18  16  16  33  8   6   11  11  0   1   12  7   8

                        Figure 21.



(3) At first glance this may appear to the untrained eye to be a monoalphabetic frequency 
distribution but upon closer inspection it is noted that aside from the frequencies of four or five 
letters the frequencies for the remaining letters are not very dissimilar. There are, in reality, no 
very marked crests and troughs, certainly not as many as would be expected in a monoalphabetic 
substitution cipher of equal length. 

(4) The message having been carefully examined for repetitions of 4 or more letters, all of 
them are listed: 

    Repetition                            Frequency    Located in lines

    TFHHAYZLQCIAAIQUCHTP (20 letters)         2        H and K.
    QMLFEQQIGOI (11 letters)                  2        C and F.
    XIFBEX (6 letters)                        2        C and D.
    FEQQ                                      3        C, D, F.
    QMLF                                      3        C, F, J.
    BFHM                                      2        B and G.
    BSPK                                      2        A and E.
    GOIH                                      2        D and J.

Since there are quite a few repetitions, two of considerable length, since all but one of them 
contain an even number of letters, and since the message also contains an even number of letters, 
344, digraphic substitution is suspected. The cryptogram is transcribed in 2-letter groups, for 
greater convenience in study. It is as follows: 
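
The search for repetitions, and the check upon whether they fall at the same position within the pairs, may be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure of the text: CIPHER is the 344-letter message as reconstructed from the solution given at the end of this paragraph, the function name positions is hypothetical, and positions are counted from 0, so that an even starting position means the repetition begins on the first letter of a pair.

```python
CIPHER = (
    "HFCAPGOQILBSPKMNDUKEOHQNFBORUN"
    "QCLCHQBQBFHMAFXSIOKOQYFNSXMCGY"
    "XIFBEXAFDXLPMXHHRGKGQKQMLFEQQI"
    "GOIHMUEORDCLTUFEQQCGQNHFXIFBEX"
    "FLBUQFCHQOQMAFTXSYCBEPFNBSPKNU"
    "QITXEUQMLFEQQIGOIEUEHPIANYTFLB"
    "FEEPIDHPCGNQIHBFHMHFXCKUPDGQPN"
    "CBCQLQPNFNPNITORTENCCBCNTFHHAY"
    "ZLQCIAAIQUCHTPCBIFGWKFCQSLQMCB"
    "OYCRQQDPRXFNQMLFIDGCCGIOGOIHHF"
    "IRCGGGNDLNOZTFGEERRPIFHOTFHHAY"
    "ZLQCIAAIQUCHTP"
)


def positions(text, pattern):
    """All starting positions (counted from 0) of pattern in text, overlaps allowed."""
    found = []
    start = text.find(pattern)
    while start >= 0:
        found.append(start)
        start = text.find(pattern, start + 1)
    return found


print(len(CIPHER), len(CIPHER) % 2 == 0)
for rep in ["TFHHAYZLQCIAAIQUCHTP", "QMLFEQQIGOI", "XIFBEX", "FEQQ", "QMLF"]:
    starts = positions(CIPHER, rep)
    print(rep, len(rep), starts, [s % 2 for s in starts])
```

A repetition whose occurrences do not all begin at positions of the same parity, as happens with FEQQ, cannot be one and the same sequence of digraphs throughout.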



Message transcribed in pairs 

          1  2  3  4  5  6  7  8  9 10 11 12 13 14 15

    A.   HF CA PG OQ IL BS PK MN DU KE OH QN FB OR UN
    B.   QC LC HQ BQ BF HM AF XS IO KO QY FN SX MC GY
    C.   XI FB EX AF DX LP MX HH RG KG QK QM LF EQ QI
    D.   GO IH MU EO RD CL TU FE QQ CG QN HF XI FB EX
    E.   FL BU QF CH QO QM AF TX SY CB EP FN BS PK NU
    F.   QI TX EU QM LF EQ QI GO IE UE HP IA NY TF LB
    G.   FE EP ID HP CG NQ IH BF HM HF XC KU PD GQ PN
    H.   CB CQ LQ PN FN PN IT OR TE NC CB CN TF HH AY
    J.   ZL QC IA AI QU CH TP CB IF GW KF CQ SL QM CB
    K.   OY CR QQ DP RX FN QM LF ID GC CG IO GO IH HF
    L.   IR CG GG ND LN OZ TF GE ER RP IF HO TF HH AY
    M.   ZL QC IA AI QU CH TP

It is noted that all the repetitions listed above break up properly into digraphs except in 
one case, viz, FEQQ in lines C, D, and F. This seems rather strange, and at first thought one 
might suppose that a letter was dropped out or was added in the vicinity of the FEQQ in line D. 
But it is immediately seen that the FE QQ in line D has no relation at all to the .F EQ Q. in 
lines C and F, and that the FEQQ in line D is merely an accidental repetition. 

(5) A digraphic frequency distribution* is made and is shown in Fig. 22. 

[Figure 22. Digraphic frequency distribution of the message; columns headed A to Z, J omitted. 
The tallies are not reproduced here.] 

(6) The appearance of the foregoing distribution for this message is quite characteristic of 
that for a digraphic substitution cipher. There are many blank cells; although there are many 
cases in which a digraph appears only once, there are quite a few in which a digraph appears 
two or three times, four cases in which a digraph appears four times, and two cases in which a 
digraph appears five times. The absence of the letter J is also noted; this is often the case in a 
digraphic system based upon a checkerboard design. 

* The distinction between "digraphic" and "biliteral" is based upon the following consideration. In a 
biliteral (or diliteral) distribution every two successive letters of the text would be grouped together to form a 
pair. For example, a biliteral distribution of ABCDEF would tabulate the pairs AB, BC, CD, DE, and EF. In a 
digraphic distribution only successive pairs of the text are tabulated. For example, ABCDEF would yield only 
AB, CD, and EF. 
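
The distinction drawn in the footnote may be expressed in a few lines of Python. The fragment is a minimal sketch; the function names are hypothetical.

```python
from collections import Counter


def biliteral_pairs(text):
    """Every two successive letters: the pairs overlap."""
    return [text[i:i + 2] for i in range(len(text) - 1)]


def digraphic_pairs(text):
    """Only successive pairs: the text is cut into non-overlapping 2-letter units."""
    return [text[i:i + 2] for i in range(0, len(text) - 1, 2)]


print(biliteral_pairs("ABCDEF"))
print(digraphic_pairs("ABCDEF"))
print(Counter(digraphic_pairs("CBCQLQPNFNPNIT")).most_common(2))
```

A digraphic distribution such as Fig. 22 is simply the tally (here a Counter) of the second kind of pairs over the whole message.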
(7) In another common type of checkerboard system known as the Playfair cipher, described 
in Par. 46, one of the telltale indications besides the absence of the letter J is the absence of double 
letters, that is, two successive identical letters. The occurrence of the double letters GG, HH, 
and QQ in the message under investigation eliminates the possibility of its being a Playfair 
cipher. The simplest thing to assume is that a 4-square checkerboard is involved. One with 
normal alphabets in Sections 1 and 2 is therefore set down (Fig. 23a). 

        A   B   C   D   E       .   .   .   .   .
        F   G   H   I-J K       .   .   .   .   .
    1   L   M   N   O   P       .   .   .   .   .   3
        Q   R   S   T   U       .   .   .   .   .
        V   W   X   Y   Z       .   .   .   .   .

        .   .   .   .   .       A   B   C   D   E
        .   .   .   .   .       F   G   H   I-J K
    4   .   .   .   .   .       L   M   N   O   P   2
        .   .   .   .   .       Q   R   S   T   U
        .   .   .   .   .       V   W   X   Y   Z

                      Figure 23a.
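
The check for doubled letters is readily mechanized. The Python fragment below is a small sketch using lines K and L of the message as transcribed in pairs; the function name is hypothetical.

```python
def doubled_digraphs(pairs):
    """Return the digraphs composed of two identical letters."""
    return [p for p in pairs if p[0] == p[1]]


line_k = "OY CR QQ DP RX FN QM LF ID GC CG IO GO IH HF".split()
line_l = "IR CG GG ND LN OZ TF GE ER RP IF HO TF HH AY".split()

doubles = doubled_digraphs(line_k + line_l)
print(doubles)       # any doubled digraph rules out the Playfair cipher
print(bool(doubles))
```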

(8) The recurrence of the group QMLF, three times, and at intervals suggesting that it might 
be a sentence separator, leads to the assumption that it is the word STOP. The letters Q, M, L, 
and F are therefore inserted in the appropriate cells in Sections 3 and 4 of the diagram. Thus 
(Fig. 23b): 

        A   B   C   D   E       .   .   .   .   .
        F   G   H   I-J K       .   .   .   .   .
    1   L   M   N   O   P       .   .   .   .   L   3
        Q   R   S   T   U       .   .   .   Q   .
        V   W   X   Y   Z       .   .   .   .   .

        .   .   .   .   .       A   B   C   D   E
        .   .   .   .   .       F   G   H   I-J K
    4   .   .   .   F   .       L   M   N   O   P   2
        .   .   M   .   .       Q   R   S   T   U
        .   .   .   .   .       V   W   X   Y   Z

                      Figure 23b.
These placements seem logical. Moreover, in Section 3 the number of cells between L and 
Q is just one less than enough to contain all the letters M to P, inclusive, and suggests that either 
N or O is in the keyword portion of the sequence, that is, near the top of Section 3. Without 
making a commitment in the matter, suppose both N and O, for the present, be inserted in the 
cell between M and P. Thus (Fig. 23c): 

[Figure 23c: the diagram of Fig. 23b with M, N-O, and P entered in row 4 of Section 3, to the left of Q.] 



(9) Now, if the placement of P in Section 3 is correct, the cipher equivalent of THp will 
begin with Pc, and there should be a group of adequate frequency to correspond. Noting that PNc occurs 
three times, it is assumed to be THp and the letter N is inserted in the appropriate cell in Section 4. 
Thus (Fig. 23d): 

[Figure 23d: the diagram of Fig. 23c with N entered in row 2, column 4 of Section 4.] 

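(10) It is about time to try out these assumed values in the message. The proper insertions 
are made, with the following results: 

          1  2  3  4  5  6  7  8  9 10 11 12 13 14 15

    A.   HF CA PG OQ IL BS PK MN DU KE OH QN FB OR UN
         .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..
    B.   QC LC HQ BQ BF HM AF XS IO KO QY FN SX MC GY
         .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..
    C.   XI FB EX AF DX LP MX HH RG KG QK QM LF EQ QI
         .. .. .. .. .. .. .. .. .. .. .. ST OP .. ..
    D.   GO IH MU EO RD CL TU FE QQ CG QN HF XI FB EX
         .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..
    E.   FL BU QF CH QO QM AF TX SY CB EP FN BS PK NU
         .. .. .. .. .. ST .. .. .. .. .. .. .. .. ..
    F.   QI TX EU QM LF EQ QI GO IE UE HP IA NY TF LB
         .. .. .. ST OP .. .. .. .. .. .. .. .. .. ..
    G.   FE EP ID HP CG NQ IH BF HM HF XC KU PD GQ PN
         .. .. .. .. .. .. .. .. .. .. .. .. .. .. TH
    H.   CB CQ LQ PN FN PN IT OR TE NC CB CN TF HH AY
         .. .. .. TH .. TH .. .. .. .. .. .. .. .. ..
    J.   ZL QC IA AI QU CH TP CB IF GW KF CQ SL QM CB
         .. .. .. .. .. .. .. .. .. .. .. .. .. ST ..
    K.   OY CR QQ DP RX FN QM LF ID GC CG IO GO IH HF
         .. .. .. .. .. .. ST OP .. .. .. .. .. .. ..
    L.   IR CG GG ND LN OZ TF GE ER RP IF HO TF HH AY
         .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..
    M.   ZL QC IA AI QU CH TP
         .. .. .. .. .. .. ..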

(11) So far no impossible combinations are in evidence. Beginning with group H4 in the 
message is seen the following sequence: 

P N F N P N 
T H . . T H 

Assume it to be THAT THE. Then ATp=FNc, and the letter N is to be inserted in row 4 column 1. 
But this is inconsistent with previous assumptions, since N in Section 4 has already been 
tentatively placed in row 2 column 4 of Section 4. Other assumptions for FNc are made: that it is 
ISp (THIS TH...); that it is ENp (THEN TH...); but the same inconsistency is apparent. In fact 
the student will see that FNc must represent a digraph ending in F, G, H, I-J, or K, since Nc is 
tentatively located on the same line as these letters in Section 2. Now FNc occurs 4 times in 
the message. The digraph it represents must be one of the following: 

DF, DG, DH, DI, DJ, DK 
IF, IG, IH, II, IJ, IK 
JF, JG, JH, JI, JJ, JK 
OF, OG, OH, OI, OJ, OK 
TF, TG, TH, TI, TJ, TK 
YF, YG, YH, YI, YJ, YK 
Of these the only one likely to be repeated 4 times is OF, yielding 

T H O F T H 
P N F N P N 

which may be a part of 

. N O R T H O F T H E .          . S O U T H O F T H E . 
C Q L Q P N F N P N I T    or    C Q L Q P N F N P N I T 

In either case, the position of the F in Section 3 is excellent: F . . . L in row 3. There are 3 
cells intervening between F and L, into which G, H, I-J, and K may be inserted. It is not nearly 
so likely that G, H, and K are in the keyword as that I should be in it. Let it be assumed that 
this is the case, and let the letters be placed in the appropriate cells in Section 3. Thus (Fig. 23e): 



        A   B   C   D   E       .   .   .   .   .
        F   G   H   I-J K       .   .   .   .   .
    1   L   M   N   O   P       F   G   H   K   L   3
        Q   R   S   T   U       M   N-O P   Q   .
        V   W   X   Y   Z       .   .   .   .   .

        .   .   .   .   .       A   B   C   D   E
        .   .   .   N   .       F   G   H   I-J K
    4   .   .   .   F   .       L   M   N   O   P   2
        .   .   M   Q   .       Q   R   S   T   U
        .   .   .   .   .       V   W   X   Y   Z

                      Figure 23e.



Let the resultant derived values be checked against the frequency distribution. If the position of 
H in Section 3 is correct, then the digraph ONp, normally of high frequency, should be represented 
several times by HFc. Reference to Fig. 22 shows a frequency of 4 times. And HMc, with 2 
occurrences, represents NSp. There is no need to go through all the possible corroborations. 

(12) Going back to the assumption that 

T H O F T H 
P N F N P N 

is part of the expression 

. N O R T H O F T H E .          . S O U T H O F T H E . 
C Q L Q P N F N P N I T    or    C Q L Q P N F N P N I T 

it is seen at once from Fig. 23e that the latter is apparently correct and not the former, because 
LQc equals OUp and not ORp. If .Sp=CQc, this means that the letter C of the digraph CQc must be 
placed in row 1 column 3 or row 2 column 3 of Section 3. Now the digraph CBc occurs 5 times, 
CGc, 4 times, CHc, 3 times, CQc, 2 times. Let an attempt be made to deduce the exact position of 
C in Section 3 and the positions of B, G, and H in Section 4. Since F is already placed in Section 
4, assume G and H directly follow it, and that B comes before it. How much before? Suppose a 
trial be made. Thus (Fig. 23f): 

[Figure 23f: trial placement of C in column 3 of Section 3 (row 1 or row 2), and of B, G, and H 
about F in Section 4.] 

By referring now to the frequency distribution, Fig. 22, after a very few minutes of 
experimentation it becomes apparent that the following is correct: 

[Figure 23g.] 
(13) The identifications given by these placements are inserted in the text, and solution 
is very rapidly completed. The final checkerboard and deciphered text are given below. 



        A   B   C   D   E       S   O   C   I   E
        F   G   H   I-J K       T   Y   A   B   D
    1   L   M   N   O   P       F   G   H   K   L   3
        Q   R   S   T   U       M   N   P   Q   R
        V   W   X   Y   Z       U   V   W   X   Z

        E   X   P   U   L       A   B   C   D   E
        S   I   O   N   A       F   G   H   I-J K
    4   B   C   D   F   G       L   M   N   O   P   2
        H   K   M   Q   R       Q   R   S   T   U
        T   V   W   Y   Z       V   W   X   Y   Z

                      Figure 23h.



    A.   HFCAP  GOQIL  BSPKM  NDUKE  OHQNF  BORUN
         ONEHU  NDRED  FIRST  FIELD  ARTIL  LERYF

    B.   QCLCH  QBQBF  HMAFX  SIOKO  QYFNS  XMCGY
         ROMPO  SITIO  NSINV  ICINI  TYOFB  ARLOW

    C.   XIFBE  XAFDX  LPMXH  HRGKG  QKQML  FEQQI
         WILLB  EINGE  NERAL  SUPPO  RTSTO  PDURI

    D.   GOIHM  UEORD  CLTUF  EQQCG  QNHFX  IFBEX
         NGATT  ACKSP  ECIAL  ATTEN  TIONW  ILLBE

    E.   FLBUQ  FCHQO  QMAFT  XSYCB  EPFNB  SPKNU
         PAIDT  OASSI  STING  ADVAN  CEOFF  IRSTB

    F.   QITXE  UQMLF  EQQIG  OIEUE  HPIAN  YTFLB
         RIGAD  ESTOP  DURIN  GADVA  NCEIT  WILLP

    G.   FEEPI  DHPCG  NQIHB  FHMHF  XCKUP  DGQPN
         LACEC  ONCEN  TRATI  ONSON  WOODS  NORTH

    H.   CBCQL  QPNFN  PNITO  RTENC  CBCNT  FHHAY
         ANDSO  UTHOF  THAYE  RFARM  ANDHI  LLSIX

    J.   ZLQCI  AAIQU  CHTPC  BIFGW  KFCQS  LQMCB
         ZEROE  IGHTD  ASHAA  NDONW  OODSE  ASTAN

    K.   OYCRQ  QDPRX  FNQML  FIDGC  CGIOG  OIHHF
         DWEST  THERE  OFSTO  PCOMM  ENCIN  GATON

    L.   IRCGG  GNDLN  OZTFG  EERRP  IFHOT  FHHAY
         ETENP  MSMOK  EWILL  BEUSE  DONHI  LLSIX

    M.   ZLQCI  AAIQU  CHTP
         ZEROE  IGHTD  ASHA
d. (1) It is interesting to note how much simpler the matter becomes when the positions
of the plain-text and cipher-text sections are reversed, or, what amounts to the same thing,
when in encipherment the plain-text pairs are sought in the sections containing the mixed alphabets,
and their cipher equivalents are taken from the sections containing the normal alphabets.
For example, referring to Fig. 23A, suppose that sections 3-4 be used as the source of the plain-text
pairs, and sections 1-2 as the source of the cipher-text pairs. Then ONp=DGc, etc.

(2) To solve a message enciphered in that manner, it is necessary merely to make a square
in which all four sections are normal alphabets, and then perform two steps. First, the cipher-text
pairs are converted into their normal-alphabet equivalents merely by “deciphering” the message
with that square; the result of this operation yields two monoalphabets, one composed of the odd
letters, the other of the even letters. The second step is to solve these two monoalphabets.

(3) Where the same mixed alphabet is inserted in sections 3 and 4, the problem is still
easier, since the letters resulting from the conversion into normal-alphabet equivalents all belong
to the same, single-mixed alphabet.
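The reduction described in subparagraphs (1) to (3) can be sketched in a few lines. This is only an illustration under stated assumptions: the keywords SOCIETY and EXPULSION are read off the mixed sections of the final checkerboard given earlier, the two normal sections are taken as upper left and lower right, and every function name is hypothetical. The check at the end confirms that after conversion with an all-normal square, each odd letter and each even letter stands for exactly one plain-text letter.

```python
NORMAL = "ABCDEFGHIKLMNOPQRSTUVWXYZ"  # 25 letters, I and J share a cell


def keyword_mixed(keyword):
    """Keyword letters first, then the unused letters in normal order."""
    out = []
    for ch in keyword.upper().replace("J", "I") + NORMAL:
        if ch not in out:
            out.append(ch)
    return "".join(out)


def cell(square, ch):
    """Return (row, column) of a letter in a 25-letter square string."""
    return divmod(square.index(ch), 5)


def encipher_reversed(p1, p2, mixed_upper, mixed_lower):
    """Plain letters are sought in the mixed sections; the cipher
    letters are read from the two normal sections."""
    r1, c2 = cell(mixed_upper, p1)
    r2, c1 = cell(mixed_lower, p2)
    return NORMAL[5 * r1 + c1], NORMAL[5 * r2 + c2]


def convert_with_normal_square(c1, c2):
    """'Decipher' a pair with a square whose four sections are all normal."""
    r1, col1 = cell(NORMAL, c1)
    r2, col2 = cell(NORMAL, c2)
    return NORMAL[5 * r1 + col2], NORMAL[5 * r2 + col1]


UPPER = keyword_mixed("SOCIETY")    # assumed keyword of the upper mixed section
LOWER = keyword_mixed("EXPULSION")  # assumed keyword of the lower mixed section

odd_map, even_map = {}, {}
for p1 in NORMAL:
    for p2 in NORMAL:
        x1, x2 = convert_with_normal_square(
            *encipher_reversed(p1, p2, UPPER, LOWER))
        odd_map.setdefault(x1, set()).add(p1)
        even_map.setdefault(x2, set()).add(p2)

# Each converted letter corresponds to a single plain letter in its position.
print(all(len(v) == 1 for v in odd_map.values()))
print(all(len(v) == 1 for v in even_map.values()))
```

With these assumed keywords the sketch reproduces the equivalence ONp=DGc quoted in subparagraph (1), which suggests the same squares underlie Fig. 23A.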

46. Analysis of ciphers based upon other types of checkerboard designs. — The solution
of cryptograms enciphered by other types of checkerboard designs is accomplished along lines
very similar to those set forth in the foregoing example of the solution of a message prepared by
means of a 4-square checkerboard design. There are, unfortunately, no means or tests which can
be applied to determine in the early stages of the analysis exactly what type of design is involved
in the first case under study. The author freely admits that the solution outlined in subparagraph
c is quite artificial in that nothing is demonstrated in step (7) that obviously leads to or warrants
the assumption that a 4-square checkerboard is involved. This point was passed over with the
quite bald statement that this was “the simplest thing to assume” — and then the solution
proceeds exactly as though this mere hypothesis has been definitely established. For example, the
very first results obtained were based upon assuming that a certain 4-letter repetition represented
the word STOP and immediately inserting certain letters in appropriate cells in a 4-square checkerboard.
Several more assumptions were built on top of that and very rapid strides were made.
What if it had not been a 4-square checkerboard at all? What if it had been a 2-square checkerboard
of the type shown in Fig. 24?
The only defense that can be made of what may seem to the student to be purely arbitrary
procedure based upon the author’s advance information or knowledge is the following: In the
first place, in order to avoid making the explanation a too-long-drawn-out affair, it is necessary
(and pedagogical experience warrants) that certain alternative hypotheses be passed over in
silence. In the second place, it may now be added, after the principles and procedure have been
elucidated (which at this stage is the primary object of this text) that if good results do not follow
from a first hypothesis, the only thing the cryptanalyst can do is to reject that hypothesis, and
formulate a second hypothesis. In actual practice he may have to reject a second, third, fourth,
. . . nth hypothesis. In the end he may strike the right one — or he may not. There is no
guaranty of success in the matter. In the third place, one of the objects of this text is to show
how certain systems, if employed for military purposes, can readily be broken down. Assuming
that a checkerboard system is in use, and that daily changes in keywords are made, it is possible
that the traffic of the first day might give considerable difficulty in solution, if the type of
checkerboard were not known to the cryptanalyst. But the second or third day’s traffic would
be easy to solve, because by that time the cryptanalytic personnel would have analyzed the
system and thus learned what type of checkerboard the enemy is using.

47. Analysis of the Playfair cipher system. — a. An excellent example of a practical, partially
digraphic system is the Playfair cipher.* It was used for a number of years as a field cipher by
the British Army, before and during the World War, and for a short time, also during that
war, by certain units of the American Expeditionary Forces.

b. Published solutions^ for this cipher are quite similar basically and vary only in minor
details. The earliest, that by Lieut. Mauborgne, used straightforward principles of frequency to
establish the values of three or four of the most frequent digraphs. Then, on the assumption
that in most cases in which a keyword appears on the first and second rows the last five letters
of the normal alphabet, VWXYZ, will rarely be disturbed in sequence and will occupy the last row
of the square, he “juggles” the letters given by the values tentatively established from frequency
considerations, placing them in various positions in the square, together with VWXYZ, to correspond
to the plain-text cipher relationships tentatively established. A later solution by Lieut. Frank
Moorman, as described in Hitt’s Manual, assumes that in a Playfair cipher prepared by means
of a square in which the keyword occupies the first and second rows, if a digraphic frequency
distribution is made, it will be found that the letters having the greatest combining power are
very probably letters of the key. A still later solution, by Lieut. Commander Smith, is perhaps
the most lucid and systematized of the three. He sets forth in definite language certain considerations
which the other two writers certainly entertained but failed to indicate.

c. The following details have been summarized from Commander Smith’s solution:

(1) The Playfair cipher may be recognized by virtue of the fact that it always contains an
even number of letters, and that when divided into groups of two letters each, no group contains
a repetition of the same letter, as NN or EE. Repetitions of digraphs, trigraphs, and polygraphs
will be evident in fairly long messages.
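The two recognition criteria in (1) lend themselves to a quick mechanical check. The sketch below is illustrative only; the function name is hypothetical, and passing the test is a necessary condition rather than proof that a Playfair square was used.

```python
def looks_like_playfair(ciphertext):
    """True when the text has an even number of letters and no digraph
    (taken in successive pairs) repeats the same letter."""
    letters = [ch for ch in ciphertext.upper() if ch.isalpha()]
    if len(letters) % 2:
        return False
    pairs = zip(letters[0::2], letters[1::2])
    return all(a != b for a, b in pairs)


print(looks_like_playfair("OTUZF ACXXC PZXH"))  # pairs OT UZ FA CX XC PZ XH
print(looks_like_playfair("FANNK"))             # odd length
print(looks_like_playfair("FANN"))              # contains the pair NN
```

Note that a doubled letter straddling two digraphs, as in CX XC, does not violate the criterion; only a doubled letter within one digraph does.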

(2) Using the square * shown in Fig. 25a, there are two general cases to be considered, as
regards the results of encipherment:
B A N K R
D E F G H
I L M O Q
U P T C Y
S V W X Z

Figure 25a.



* This cipher was really invented by Sir Charles Wheatstone but receives its name from Lord Playfair,
who apparently was its sponsor before the British Foreign Office. See Wemyss Reid, Memoirs of Lyon Playfair,
London, 1899. A detailed description of this cipher will be found in Sec. VIII, Advanced Military Cryptography.

^ Mauborgne, Lieut. J. O., U. S. A. An advanced problem in cryptography and its solution, Leavenworth, 1914.

Hitt, Captain Parker, U. S. A. Manual for the solution of military ciphers, Leavenworth, 1918.

Smith, Lieut. Commander W. W., U. S. N. In Cryptography by André Langie, translated by J. C. H.
Macbeth, New York, 1922.

* The Playfair square accompanying Commander Smith's solution is based upon the keyword BANKRUPTCY,
"to be distributed between the first and fourth lines of the square." This is a simple departure from the original
Playfair scheme in which the letters of the keyword are written from left to right and in consecutive lines from
the top downward.
Case 1. Letters at opposite corners of a rectangle. The following illustrative relationships
are found:

THp=YFc
HTp=FYc
YFp=THc
FYp=HTc

Reciprocity is complete.

Case 2. Two letters in the same line or column. The following illustrative relationships
are found:

ANp=NKc
NAp=KNc

But NKp does not=ANc, nor does KNp=NAc.

Reciprocity is only partial.
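The two cases can be reproduced with a short sketch. The square below is the one implied by the equations in the text (keyword BANKRUPTCY split between the first and fourth rows); it is an inference from those equations, not a copy of the printed figure, and the function names are hypothetical.

```python
# Square inferred from the equations in the text (assumed, not copied).
SQUARE = ["BANKR", "DEFGH", "ILMOQ", "UPTCY", "SVWXZ"]


def locate(square, ch):
    """Return (row, column) of a letter in the square."""
    for r, row in enumerate(square):
        c = row.find(ch)
        if c >= 0:
            return r, c
    raise ValueError(ch)


def encipher_pair(square, a, b):
    """Playfair encipherment of one plain-text digraph."""
    ra, ca = locate(square, a)
    rb, cb = locate(square, b)
    if ra == rb:   # Case 2, same row: take the letter to the right
        return square[ra][(ca + 1) % 5] + square[rb][(cb + 1) % 5]
    if ca == cb:   # Case 2, same column: take the letter below
        return square[(ra + 1) % 5][ca] + square[(rb + 1) % 5][cb]
    # Case 1, rectangle: same row as the letter, column of its partner
    return square[ra][cb] + square[rb][ca]


for plain in ("TH", "HT", "YF", "FY", "AN", "NA", "NK", "KN"):
    print(plain, "->", encipher_pair(SQUARE, *plain))
```

The first four lines of output show the complete reciprocity of Case 1; the last four show that in Case 2 only the reversal of both digraphs holds.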

(3) The foregoing gives rise to the following:

Rule I. (a) Regardless of the position of the letters in the square, if

1.2=3.4, then
2.1=4.3

(b) If 1 and 2 form opposite corners of a rectangle, the following equations obtain:

1.2=3.4
2.1=4.3
3.4=1.2
4.3=2.1

(4) A letter considered as occupying a position in a row can be combined with but four other
letters in the same row; the same letter considered as occupying a position in a column can be
combined with but four other letters in the same column. Thus, this letter can be combined with
only 8 other letters all told, under Case 2, above. But the same letter considered as occupying
a corner of a rectangle can be combined with 16 other letters, under Case 1, above. Commander
Smith derives from these facts the conclusion that “it would appear that Case 1 is twice as probable
as Case 2.” He continues thus (notation my own):



“Now in the square, note that:

ANp=NKc              ENp=FAc
GNp=FKc              EMp=FLc
ONp=MKc     also     ETp=FPc
CNp=TKc              EWp=FVc
XNp=WKc              EFp=FGc

“From this it is seen that of the 24 equations that can be formed when each letter of the
square is employed either as the initial or final letter of the group, five will indicate a repetition of
a corresponding letter of plain text.

“Hence, Rule II. After it has been determined, in the equation 1.2=3.4, that, say, ENp=FAc,
there is a probability of one in five that any other group beginning with Fc indicates Eθp, and that
any group ending in Ac indicates θNp.

“After such combinations as ERp, ORp, and ENp have been assumed or determined, the above
rule may be of use in discovering additional digraphs and partial words.” *

Rule III. In the equation 1.2=3.4, 1 and 3 can never be identical, nor can 2 and 4 ever be
identical. Thus, ANp could not possibly be represented by AYc, nor could ERp be represented by KRc.
This rule is useful in elimination of certain possibilities when a specific message is being studied.

Rule IV. In the equation 1.2p=3.4c, if 2 and 3 are identical, the letters are all in the same
row or column, and in the relative order 124. In the square shown, ANp=NKc and the absolute
order is ANK. The relative order 124 includes five absolute orders which are cyclic permutations
of one another. Thus: ANK. ., NK. .A, K. .AN, . .ANK, and .ANK..

Rule V. In the equation 1.2p=3.4c, if 1 and 4 are identical, the letters are all in the same
row or column, and in the relative order 243. In the square shown, KNp=RKc and the absolute
order is NKR. The relative order 243 includes five absolute orders which are cyclic permutations
of one another. Thus: NKR. ., KR. .N, R. .NK, . .NKR, and .NKR..

Rule VI. “Analyze the message for group recurrences. Select the groups of greatest
recurrence and assume them to be high-frequency digraphs. Substitute the assumed digraphs
throughout the message, testing the assumptions in their relation to other groups of the cipher.
The reconstruction of the square proceeds simultaneously with the solution of the message and
aids in hastening the translation of the cipher.”
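Rules III to V are mechanical enough to be expressed as a small test applied to any tentative equation 1.2p=3.4c. The following is a minimal sketch; the function names and the wording of the returned conclusions are illustrative only.

```python
def apply_rules(plain, cipher):
    """Apply Rules III, IV and V to the tentative equation plain=cipher."""
    one, two = plain      # letters 1 and 2
    three, four = cipher  # letters 3 and 4
    if one == three or two == four:
        return "impossible (Rule III)"
    if two == three:      # Rule IV: relative order 1-2-4
        return "same line, order " + one + two + four
    if one == four:       # Rule V: relative order 2-4-3
        return "same line, order " + two + four + three
    return "no positional conclusion"


def absolute_orders(order):
    """The five cyclic placements of a three-letter order in a line of five."""
    base = order + ".."
    return [base[i:] + base[:i] for i in range(5)]


print(apply_rules("AN", "NK"))   # Rule IV example from the text
print(apply_rules("KN", "RK"))   # Rule V example from the text
print(apply_rules("AN", "AY"))   # excluded by Rule III
print(absolute_orders("ANK"))
```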

d. (1) When solutions for the Playfair cipher system were first developed, based upon the
fact that the letters were inserted in the cells in keyword-mixed order, cryptographers thought
it desirable to place stumbling blocks in the path of such solution by departing from strict,
keyword-mixed order. Playfair squares of the latter type are designated as “modified Playfair
squares.” One of the simplest methods is illustrated in Fig. 25a, wherein it will be noted that
the last five letters of the keyword proper are inserted in the fourth row of the square instead
of the second, where they would naturally fall. Another method is to insert the letters within
the cells from left to right and top downward but use a sequence that is a keyword-mixed sequence
developed by a columnar transposition based upon the keyword proper. Thus, using the keyword
BANKRUPTCY:

2  1  5  4  7  9  6  8  3  10
B  A  N  K  R  U  P  T  C  Y
D  E  F  G  H  I  L  M  O  Q
S  V  W  X  Z

Sequence: A E V B D S C O K G X N F W P L R H Z T M U I Y Q
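The columnar-transposition method just described can be sketched as follows. The function name and the 25-letter alphabet constant (J omitted) are assumptions made for the illustration; the keyword is assumed to contain no letter J.

```python
ALPHABET = "ABCDEFGHIKLMNOPQRSTUVWXYZ"  # 25 letters, J omitted


def transposed_mixed_sequence(keyword):
    """Write keyword, then remaining letters, in rows under the keyword;
    read the columns off in alphabetical order of the keyword letters."""
    key = []
    for ch in keyword.upper():
        if ch not in key:
            key.append(ch)
    letters = key + [ch for ch in ALPHABET if ch not in key]
    width = len(key)
    columns = ["".join(letters[i::width]) for i in range(width)]
    order = sorted(range(width), key=lambda i: key[i])
    return "".join(columns[i] for i in order)


sequence = transposed_mixed_sequence("BANKRUPTCY")
print(" ".join(sequence))
# Inscribe the sequence in the square from left to right, top downward.
for i in range(0, 25, 5):
    print(" ".join(sequence[i:i + 5]))
```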

* There is an error in this reasoning. Take, for example, the 24 equations having F as an initial letter:

Case                Case                Case
1. FAc=ENp          1. FIc=DMp          1. FSc=DWp
1. FBc=DNp          1. FKc=GNp          2. FTc=NMp
1. FCc=GTp          1. FLc=EMp          1. FUc=DTp
2. FDc=EHp          2. FMc=NFp          1. FVc=EWp
2. FEc=EDp          2. FNc=NWp          2. FWc=NTp
2. FGc=EFp          1. FOc=GMp          1. FXc=GWp
2. FHc=EGp          1. FPc=ETp          1. FYc=HTp
                    1. FQc=HMp          1. FZc=HWp
                    1. FRc=HNp

Here, the initial letter Fc represents the following initial letters of plain-text digraphs:

D E N G H

It is seen that Fc represents Dp, Np, Gp, Hp 4 times each, and Ep, 8 times. Consequently, supposing that it has
been determined that FAc=ENp, the probability that Fc will represent Ep is not 1 in 5 but 8 in 24, or 1 in 3; but
supposing that it has been determined that FWc=NTp, the probability that Fc will represent Np is 4 in 24 or 1 in 6.
The difference in these probabilities is occasioned by the fact that the first instance, FAc=ENp, corresponds to a
Case 1 encipherment, the second instance, FWc=NTp, to a Case 2 encipherment. But there is no way of knowing
initially, and without other data, whether one is dealing with a Case 1 or Case 2 encipherment. Only as an
approximation, therefore, may one say that the probability of Fc representing a given θp is 1 in 6.
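The tally in this footnote can be checked by enumeration. The sketch below assumes the BANKRUPTCY square implied by the equations in the text and uses a hypothetical decipher_pair helper; it deciphers all 24 cipher digraphs that begin with F and counts the initial plain-text letters.

```python
from collections import Counter

# Square inferred from the equations in the text (assumed).
SQUARE = ["BANKR", "DEFGH", "ILMOQ", "UPTCY", "SVWXZ"]


def locate(square, ch):
    """Return (row, column) of a letter in the square."""
    for r, row in enumerate(square):
        c = row.find(ch)
        if c >= 0:
            return r, c
    raise ValueError(ch)


def decipher_pair(square, a, b):
    """Playfair decipherment of one cipher digraph."""
    ra, ca = locate(square, a)
    rb, cb = locate(square, b)
    if ra == rb:   # same row: take the letter to the left
        return square[ra][(ca - 1) % 5] + square[rb][(cb - 1) % 5]
    if ca == cb:   # same column: take the letter above
        return square[(ra - 1) % 5][ca] + square[(rb - 1) % 5][cb]
    return square[ra][cb] + square[rb][ca]


others = [x for row in SQUARE for x in row if x != "F"]
tally = Counter(decipher_pair(SQUARE, "F", x)[0] for x in others)
print(sorted(tally.items()))
```

The count of 8 for E against 4 for each of D, G, H and N is what produces the 1-in-3 and 1-in-6 figures quoted above.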
The Playfair square is as follows:

A E V B D
S C O K G
X N F W P
L R H Z T
M U I Y Q

Figure 25b.
(2) In the foregoing square practically all indications that the square has been developed
from a keyword have disappeared. The principal disadvantage of such an arrangement is that
it requires more time to locate the letters desired, both in cryptographing and decryptographing,
than it usually does when a semblance of normal alphabetic order is preserved in the square.

(3) Note the following three squares: 
At first glance they all appear to be different, but closer examination shows them to be cyclic
permutations of one another and of the square in Fig. 25b. They yield identical equivalents in
all cases. However, if an attempt be made to reconstruct the original keyword, it would be
much easier to do so from Fig. 25b than from any of the others, because in Fig. 25b the keyword-mixed
sequence has not been disturbed as much as in Figs. 25c, d, e. In working with Playfair
ciphers, the student should be on the lookout for such instances of cyclic permutation of the
original Playfair square, for during the course of solution he will not know whether he is building
up the original or an equivalent cyclic permutation of the original square; only after he has
completely reconstructed the square will he be able to determine this point.

(4) It can readily be shown that the columns of a Playfair square may be cyclically permuted
(see subpar. d) to produce a first set of 25 squares all of which, though at first glance apparently
different, will yield identical equivalents; likewise, the rows of such a square may be cyclically
permuted to produce a second set of 25 squares all of which will also yield identical equivalents.
Thus there may be a total of 50 cyclic permutations composed of two sets of 25 each. The
cipher equivalents yielded by Case 2 encipherments (letters in the same row or in the same
column) will be identical for any two of these 50 different Playfair squares; but the cipher equivalents
yielded by Case 1 encipherments (letters at diagonally opposite corners of a rectangle)
will be identical only for two squares belonging to the same set of 25 cyclic permutations.
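The equivalence of cyclically permuted squares can be verified by brute force. The sketch below makes one interpretive assumption, stated plainly: the first set of 25 is taken to be the row and column shifts of a square, and the second set the same shifts applied to the square with its rows and columns interchanged. The square is the BANKRUPTCY square implied by the text, and all names are hypothetical.

```python
from itertools import permutations

SQUARE = ["BANKR", "DEFGH", "ILMOQ", "UPTCY", "SVWXZ"]  # assumed square
LETTERS = "".join(SQUARE)


def locate(square, ch):
    for r, row in enumerate(square):
        c = row.find(ch)
        if c >= 0:
            return r, c
    raise ValueError(ch)


def encipher_pair(square, a, b):
    ra, ca = locate(square, a)
    rb, cb = locate(square, b)
    if ra == rb:
        return square[ra][(ca + 1) % 5] + square[rb][(cb + 1) % 5]
    if ca == cb:
        return square[(ra + 1) % 5][ca] + square[(rb + 1) % 5][cb]
    return square[ra][cb] + square[rb][ca]


def shift(square, dr, dc):
    """Cyclically shift the rows by dr and the columns by dc."""
    return ["".join(square[(r + dr) % 5][(c + dc) % 5] for c in range(5))
            for r in range(5)]


def transpose(square):
    """Interchange rows and columns."""
    return ["".join(row[c] for row in square) for c in range(5)]


def table(square):
    """Cipher equivalent of every digraph of two different letters."""
    return {a + b: encipher_pair(square, a, b)
            for a, b in permutations(LETTERS, 2)}


def same_line(pair):
    (ra, ca), (rb, cb) = locate(SQUARE, pair[0]), locate(SQUARE, pair[1])
    return ra == rb or ca == cb


base = table(SQUARE)
shifts_agree = all(table(shift(SQUARE, dr, dc)) == base
                   for dr in range(5) for dc in range(5))
flipped = table(transpose(SQUARE))
case2_agree = all(flipped[p] == base[p] for p in base if same_line(p))
case1_reversed = all(flipped[p] == base[p][::-1]
                     for p in base if not same_line(p))
print(shifts_agree, case2_agree, case1_reversed)
```

Under this reading, squares of the other set give the same Case 2 equivalents but reverse the two letters of every Case 1 equivalent.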

e. (1) The steps in the solution of a typical example of this cipher may be useful. Let
the message be as follows:
(2) Without going through the preliminary tests in detail, with which it will be assumed
that the student is now familiar,* the conclusion is reached that the cryptogram is digraphic in
nature, and a digraphic frequency distribution is made (Fig. 26).

* See Par. 44c.

Since there are no double-letter groups, the conclusion is reached that a Playfair cipher
is involved and the message is rewritten in digraphs.
(3) The following three fairly lengthy repetitions are noted:

Line F.  OT UZ FA CX XC PZ XH CY NO
Line G.  TE XH FA CX XC PZ XH YC TX
         (repeated: FA CX XC PZ XH)

Line A.  FT CH XS CA KT VT RA ZE
Line (label illegible).  DG XV XS CA KT VT PK PU
         (repeated: XS CA KT VT)

Line G.  TM SM XC PT OT CX OT TC
Line K.  WG HB XC PT OT CX OT MI
         (repeated: XC PT OT CX OT)

















The first long repetition, with the sequent reversed digraphs CX and XC, immediately suggests
the word BATTALION, split up into -B AT TA LI ON, and the sequence containing this repetition
in lines F and G becomes as follows:

Line F.  OX OT UZ FA CX XC PZ XH CY NO TY
                  -B AT TA LI ON

Line G.  YA TE XH FA CX XC PZ XH YC TX WL
               ON -B AT TA LI ON









(4) Because of the frequent use of numerals before the word BATTALION and because of
the appearance of ON before this word in line G, the possibility suggests itself that the word
before BATTALION in line G is either ONE or SECOND. The identical digraph FA in both cases
gives a hint that the word BATTALION in line F may also be preceded by a numeral; if ONE is
correct in line G, then THREE is possible in line F. On the other hand, if SECOND is correct in
line G, then THIRD is possible in line F. Thus:



Line F.           OX OT UZ FA CX XC PZ XH CY NO TY
1st hypothesis.   -- TH RE EB AT TA LI ON
2nd hypothesis.   -- TH IR DB AT TA LI ON

Line G.           YA TE XH FA CX XC PZ XH YC TX WL
1st hypothesis.   -- -- ON EB AT TA LI ON
2nd hypothesis.   -S EC ON DB AT TA LI ON

First, note that if either hypothesis is true, then OTc=THp. The frequency distribution shows
that OT occurs 6 times and is in fact the most frequent digraph in the message. Moreover, by
Rule I of subparagraph c, if OTc=THp then TOc=HTp. Since HTp is a very rare digraph in normal
plain text, TOc should either not occur at all in so short a message or else it should be very infrequent.
The frequency distribution shows its entire absence. Hence, there is nothing inconsistent
with the possibility that the word in front of BATTALION in line F is THREE or THIRD,
and some evidence that it is actually one or the other.

(5) But can evidence be found for the support of one hypothesis against the other? Let
the frequency distribution be examined with a view to throwing light upon this point. If the
first hypothesis is true, then UZc=REp, and, by Rule I, ZUc=ERp. The frequency distribution
shows but one occurrence of UZc and but two occurrences of ZUc. These do not look very good for
RE and ER. On the other hand, if the second hypothesis is true, then UZc=IRp and, by Rule I,
ZUc=RIp. The frequencies are much more favorable in this case. Is there anything inconsistent
with the assumption, on the basis of the second hypothesis, that TEc=ECp? The frequency
distribution shows no inconsistency, for TEc occurs once and ETc (=CEp, by Rule I) occurs once.
As regards whether FAc=EBp or DBp, both hypotheses are tenable; possibly the second hypothesis
is a shade better than the first, on the following reasoning: By Rule I, if FAc=EBp then AFc=BEp;
or if FAc=DBp then AFc=BDp. The fact that no AFc occurs, whereas at least one BEp may be
expected in this message, inclines one to the second hypothesis, since BDp is very rare.
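The use made of Rule I in steps (4) and (5) amounts to reversing both sides of a tentative equation and looking up the frequency of the reversed cipher digraph. A minimal sketch follows; the counts are only those quoted in the text above, and the function name is hypothetical.

```python
# Digraph counts as quoted in the text (the full distribution is not used).
COUNTS = {"OT": 6, "TO": 0, "UZ": 1, "ZU": 2, "TE": 1, "ET": 1, "AF": 0}


def rule_one(plain, cipher):
    """Rule I: if 1.2=3.4 then 2.1=4.3."""
    return plain[::-1], cipher[::-1]


for plain, cipher in (("TH", "OT"), ("RE", "UZ"), ("IR", "UZ"), ("EC", "TE")):
    rev_plain, rev_cipher = rule_one(plain, cipher)
    print(plain + "p=" + cipher + "c implies " + rev_plain + "p=" + rev_cipher
          + "c, which occurs " + str(COUNTS[rev_cipher]) + " time(s)")
```

Whether a given count is acceptable still rests on the analyst's judgment of how frequent the reversed plain-text digraph ought to be.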

(6) Let the 2nd hypothesis be assumed to be correct. The additional values are tentatively
inserted in the text, and in lines G and K two interesting repetitions are noted:

Line G.  TM SM XC PT OT CX OT TC YA TE XH FA CX XC PZ XH
         -- -- TA -- TH AT TH -- -- EC ON DB AT TA LI ON

Line K.  WG HB XC PT OT CX OT MI PY DN FG KI TC OL XU ET
         -- -- TA -- TH AT TH

This certainly looks like STATE THAT THE. . . , which would make TEp=PTc. Furthermore, in
line G the sequence STATETHATTHE. .SECONDBATTALION can hardly be anything else than
STATE THAT THEIR SECOND BATTALION, which would make TCc=EIp and YAc=RSp. Also
SMc=-Sp.
(7) It is perhaps high time that the whole list of tentative equivalent values be studied in
relation to their consistency with the positions of letters in the Playfair square; moreover, by so
doing, additional values may be obtained in the process. The complete list of values is as follows:

Assumed values       Derived by Rule I
ATp=CXc              TAp=XCc
LIp=PZc              ILp=ZPc
ONp=XHc              NOp=HXc
THp=OTc              HTp=TOc
IRp=UZc              RIp=ZUc
DBp=FAc              BDp=AFc
ECp=TEc              CEp=ETc
TEp=PTc              ETp=TPc
EIp=TCc              IEp=CTc
RSp=YAc              SRp=AYc
-Sp=SMc





(8) By Rule V, the equation THp=OTc means that H, T, and O are all in the same row or column
and in the relative order 2-4-3; similarly, C, E, and T are in the same row or column and in the
relative order 2-4-3. Further, E, P, and T are in the same row or column, and their relative
order is also 2-4-3. That is, these sequences must occur in the square:

(1)            (2)            (3)
H T O . .      C E T . .      E T P . .
   or             or             or
T O . . H      E T . . C      T P . . E
   or             or             or
O . . H T      T . . C E      P . . E T
   or             or             or
. . H T O      . . C E T      . . E T P
   or             or             or
. H T O .      . C E T .      . E T P .





(9) Noting the common letters E and T in the second and third sets of relative orders, these
may be combined into one sequence of four letters. Only one position remains to be filled and
noting, in the list of equivalents, that EIp=TCc, it is obvious that the letter I belongs to the CET
sequence. The complete sequence is therefore as follows:

C E T P I , or
E T P I C , or
T P I C E , or
P I C E T , or
I C E T P

(10) Taking up the HTO sequence, it is noted, in the list of equivalents, that ONp=XHc, an
equation containing two of the three letters of the HTO sequence. From this it follows that
N and X must belong to the same row or column as HTO. The arrangement must be one of the
following:

H T 0 X N 
T 0 X N H 
0 X N H T 
X N H T 0 
N H T 0 X 

(11) Since the sequence containing HTOXN has a common letter (T) with the sequence
CETPI, it follows that if the HTOXN sequence occupies a row, then the CETPI sequence must
occupy a column; or, if the HTOXN sequence occupies a column, then the CETPI sequence must
occupy a row; and they may be combined by means of their common letter, T. According to
subpar. d (4), the two sequences may be inserted within a Playfair square in 25 different ways
by cyclically permuting and shifting the letters of one of these two sequences; and the same
two sequences may be again inserted in another set of 25 ways by cyclically permuting and
shifting the letters of the other of these two sequences. In Fig. 27 the diagrams labeled (1)
to (10), inclusive, show 10 of the possible 25 obtainable by making the HTOXN sequence one
of the rows of the square; diagrams (11) and (12) show 2 of the possible 25 obtainable by making
the HTOXN sequence one of the columns of the square. The entire complement of 25 arrangements
for each set may easily be drawn up by the student; space forbids their being completely
set forth and it is really unnecessary to do so.



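[Fig. 27: skeleton diagrams (1) to (12), showing placements of the HTOXN and CETPI sequences in the square; the diagrams are not recoverable from this copy.]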
(12) Before trying to discover means whereby the actual or absolute arrangement may
be detected from among the full set of 50 possible arrangements, the question may be raised: is
it necessary? So far as concerns Case 2 encipherments, since any one of the 50 arrangements
will yield the same equivalents as any of the remaining 49, perhaps a relative arrangement
will do.

(13) Let arrangement 8 be arbitrarily selected for trial.







. . P . .
. . I . .
. . C . .
. . E . .
N H T O X

Figure 28a.



(14) What additional letters can be inserted, using as a guide the list of equivalents in subparagraph
(7)? There is ATp=CXc, for example. It contains only one letter, A, not in the
arrangement selected for trial, and this letter may immediately be placed, as shown:*

. . P . .
. . I . .
. . C . A
. . E . .
N H T O X

Figure 28b.



Scanning the list for additional cases of this type, none are found. But seeing that several high-frequency
letters have already been inserted in the square, perhaps reference to the cryptogram
itself in connection with values derived from these inserted letters may yield further clues. For
example, the vowels A, E, I, and O are all in position, as are the very frequent consonants N and T.
The following combinations may be studied:

ANp=θXc     ATp=CXc
NAp=Xθc     TAp=XCc
ENp=θTc     ETp=TPc
NEp=Tθc     TEp=PTc
INp=θTc     ITp=CPc
NIp=Tθc     TIp=PCc
ONp=XHc     OTp=XOc
NOp=HXc     TOp=OXc

ATp(=CXc), TAp(=XCc), ONp(=XHc), TEp(=PTc) and ETp(=TPc) have already been inserted in the
text. Of the others, only OXc(=TOp) occurs two times, and this value can be at once inserted in
the text. But can the equivalents of AN, EN, or IN be found from frequency considerations?

* The fact that the placement of A yields ATp=CXc means that the outline selected for experiment really
belongs to the correct set of 25 possible cyclic permutations, and that the letters of the NHTOX sequence belong
in a row, the letters of the PICET sequence belong in a column of the original Playfair square. If the reverse
were the case, one could not obtain ATp=CXc but would obtain ATp=XCc.
Take ENp, for example; it is represented by θTc. What combination of θT is most likely to represent
ENp among the following candidates?

KTc (4 times); by Rule I, NEp would=TKc (no occurrences)

VTc (5 times); by Rule I, NEp would=TVc (2 times)

ZTc (3 times); by Rule I, NEp would=TZc (1 time)

VTc certainly looks good: it begins the message, suggesting the word ENEMY; in line H, the
sequence PZTV would become LINE. Let this be assumed to be correct, and let the word ENEMY
also be assumed to be correct. Then EMp=QEc and the square then becomes as shown herewith:







. . P . .
. . I . .
. . C . A
V M E Q .
N H T O X

Figure 28c.



(15) In line E is seen the following sequence:

Line E:  VT RK MW CF ZU BH TV YA BG IP RZ KP CQ FN LV
         EN -- -- -- RI -- NE RS -- PT -- E-

The sequence . . .RI. .NERS. .PT. . . suggests PRISONERS CAPTURED, as follows:

MW CF ZU BH TV YA BG IP RZ KP
-- -P RI SO NE RS CA PT UR ED

This gives the following new values: θPp=CFc; SOp=BHc; CAp=BGc; URp=RZc; EDp=KPc.

The letters B and G can be placed in position at once, since the positions of C and A are already
known. The insertion of the letter B immediately permits the placement of the letter S, from the
equation SOp=BHc. Of the remaining equations only EDp=KPc can be used. Since E and P are
fixed and are in the same column, D and K must be in the same column, and moreover the K must
be in the same row as E. There is only one possible position for K, viz, immediately after Q. This
automatically fixes the position of D. The square is now as shown herewith:







. . P . D
. . I . .
G S C B A
V M E Q K
N H T O X

Figure 28d.
(16) A review of all equations, including the very first ones established, gives the following
which may now be used: DBp=FAc; RSp=YAc. The first permits the immediate placement of F;
the second, by elimination of possible positions, permits the placement of both R and Y. The
square is now as shown herewith:







. . P F D
. Y I . R
G S C B A
V M E Q K
N H T O X

Figure 28e.



Once more a review is made of all remaining thus far unused equations. LIp=PZc now permits
the placement of L and Z. IRp=UZc now permits the placement of U, which is confirmed by the
equation URp=RZc from the word CAPTURED.



L . P F D
Z Y I U R
G S C B A
V M E Q K
N H T O X






There is then only one cell vacant, and it must be occupied by the only letter left unplaced,
viz, W. Thus the whole square has been reconstructed, and the message can now be
decryptographed.
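As a check on the reconstruction, every equation established in steps (3) to (16) can be tested against the completed square. The sketch below uses the square with W in the vacant cell; the helper names are hypothetical.

```python
# Reconstructed square from steps (13) to (16), with W in the vacant cell.
SQUARE = ["LWPFD", "ZYIUR", "GSCBA", "VMEQK", "NHTOX"]

# Plain-text digraph -> cipher digraph, as established in the solution.
EQUATIONS = {
    "AT": "CX", "LI": "PZ", "ON": "XH", "TH": "OT", "IR": "UZ", "DB": "FA",
    "EC": "TE", "TE": "PT", "EI": "TC", "RS": "YA", "SO": "BH", "CA": "BG",
    "UR": "RZ", "ED": "KP", "EN": "VT", "EM": "QE",
}


def locate(ch):
    for r, row in enumerate(SQUARE):
        c = row.find(ch)
        if c >= 0:
            return r, c
    raise ValueError(ch)


def convert(pair, step):
    """step=+1 enciphers, step=-1 deciphers one digraph."""
    (ra, ca), (rb, cb) = locate(pair[0]), locate(pair[1])
    if ra == rb:
        return SQUARE[ra][(ca + step) % 5] + SQUARE[rb][(cb + step) % 5]
    if ca == cb:
        return SQUARE[(ra + step) % 5][ca] + SQUARE[(rb + step) % 5][cb]
    return SQUARE[ra][cb] + SQUARE[rb][ca]


mismatches = [p for p, c in EQUATIONS.items() if convert(p, +1) != c]
print("equations not satisfied:", mismatches)
print(" ".join(convert(d, -1) for d in "FA CX XC PZ XH".split()))
```

The second line of output deciphers the long repetition of lines F and G.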

(17) Is the square just reconstructed identical with the original, or is it a cyclic permutation
of a keyword-mixed Playfair square of the type illustrated in Fig. 25b? Even though the
message can be read with ease, this point is still of interest. Let the sequence be written in five
ways, each composed of five partial sequences made by cyclically permuting each of the horizontal
rows of the reconstructed square. Thus:





       Row 1        Row 2        Row 3        Row 4        Row 5
(a)  L W P F D    Z Y I U R    G S C B A    V M E Q K    N H T O X
(b)  W P F D L    Y I U R Z    S C B A G    M E Q K V    H T O X N
(c)  P F D L W    I U R Z Y    C B A G S    E Q K V M    T O X N H
(d)  F D L W P    U R Z Y I    B A G S C    Q K V M E    O X N H T
(e)  D L W P F    R Z Y I U    A G S C B    K V M E Q    X N H T O
By experimenting with these five sequences, in an endeavor to reconstruct a transposition
rectangle conformable to a keyword sequence, the last sequence yields the following:

    .  P  Y  A  C  M  N
    D  F  I  G  B  E  H
    L  R  U  S  K  Q  T
    W  Z  .  .  V  X  .  O

By shifting the O from the last position to the first, and rearranging the columns, the following
is obtained:

    2  5  3  6  1  4  7
    C  O  M  P  A  N  Y
    B  D  E  F  G  H  I
    K  L  Q  R  S  T  U
    V  W  X  Z

The original square must have been this:



    A  G  S  C  B
    K  V  M  E  Q
    X  N  H  T  O
    D  L  W  P  F
    R  Z  Y  I  U
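The relation between the original square and the reconstructed one can be verified by rebuilding the square from the keyword. The sketch below, in Python, assumes a 25-letter alphabet with J omitted, a keyword-mixed sequence written in rows under the keyword COMPANY, and columns taken off in alphabetical order of the key letters (the numerical key 2 5 3 6 1 4 7). The function names are illustrative only.

```python
ALPHABET = "ABCDEFGHIKLMNOPQRSTUVWXYZ"  # 25 letters; J omitted (assumption)


def keyword_mixed_sequence(keyword):
    """Columnar-transposed keyword-mixed sequence for the given keyword."""
    key = "".join(dict.fromkeys(keyword))          # drop repeated key letters
    rest = "".join(ch for ch in ALPHABET if ch not in key)
    seq = key + rest
    width = len(key)
    rows = [seq[i:i + width] for i in range(0, len(seq), width)]
    # Take the columns in alphabetical order of the key letters.
    order = sorted(range(width), key=lambda i: key[i])
    return "".join(
        "".join(row[i] for row in rows if i < len(row)) for i in order
    )


def to_square(seq):
    """Write a 25-letter sequence into a 5 x 5 square, row by row."""
    return [seq[i:i + 5] for i in range(0, 25, 5)]


def rotate(square, row_shift, col_shift):
    """Cyclically shift the rows up by row_shift and the columns left by col_shift."""
    rows = square[row_shift:] + square[:row_shift]
    return [r[col_shift:] + r[:col_shift] for r in rows]


ORIGINAL = to_square(keyword_mixed_sequence("COMPANY"))
RECONSTRUCTED = ["LWPFD", "ZYIUR", "GSCBA", "VMEQK", "NHTOX"]

if __name__ == "__main__":
    print(ORIGINAL)
    print(rotate(ORIGINAL, 3, 1) == RECONSTRUCTED)
```

Under these assumptions the reconstructed square is the original square with its rows shifted cyclically by three places and its columns by one, which is why both squares decipher the message identically.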



f. Continued practice in the solution of Playfair ciphers will make the student quite expert
in the matter and will enable him to solve shorter and shorter messages.¹ Also, with practice
it will become a matter of indifference to him as to whether the letters are inserted in the square
with any sort of regularity, such as simple keyword-mixed order, columnar transposed
keyword-mixed order, or in a purely random order.

g. It may perhaps seem to the student that the foregoing steps are somewhat too artificial,
a bit too "cut and dried" in their accuracy to portray the process of analysis as it is applied in
practice. For example, the critical student may well object to some of the assumptions and the
reasoning in step (5) above, in which the words THREE and ONE (1st hypothesis) were rejected
in favor of the words THIRD and SECOND (2d hypothesis). This rested largely upon the
rejection of REp and ERp as the equivalents of UZc and ZUc, and the adoption of IRp and RIp as
their equivalents. Indeed, if the student will examine the final message with a critical eye he
will find that while the bit of reasoning in step (5) is perfectly logical, the assumption upon which
it is based is in fact wrong, for it happens that in this case ERp occurs only once and REp does not
occur at all. Consequently, although most of the reasoning which led to the rejection of the 1st
hypothesis and the adoption of the 2d was logical, it was in fact based upon erroneous assumption.
In other words, despite the fact that the assumption was incorrect, a correct deduction
was made. The student should take note that in cryptanalysis situations of this sort are not at all
unusual. Indeed they are to be expected and a few words of explanation at this point may be useful.

¹ The author once had a student who "specialized" in Playfair ciphers and became so adept that he could
solve messages containing as few as 50-60 letters within 30 minutes.

h. Cryptanalysis is a science in which deduction, based upon observational data, plays a
very large role. But it is also true that in this science most of the deductions usually rest upon
assumptions. It is most often the case that the cryptanalyst is forced to make his assumptions
upon a quite limited amount of text. It cannot be expected that assumptions based upon
statistical generalizations will always hold true when applied to data comparatively very much
smaller in quantity than the total data used to derive the generalized rules. Consequently, as
regards assumptions made in specific messages, most of the time they will be correct, but
occasionally they will be incorrect. In cryptanalytics it is often found that among the correct
deductions there will be cases in which subsequently discovered facts do not bear out the assumptions
on which the deduction was based. Indeed, it is sometimes true that if the facts had been known
before the deduction was made, this knowledge would have prevented making the correct
deduction. For example, suppose the cryptanalyst had somehow or other divined that the message
under consideration contained no RE, only one ER, one IR, and two RI's (as is actually the case).
He would certainly not have been able to choose between the words THREE and ONE (1st
hypothesis) as against THIRD and SECOND (2d hypothesis). But because he assumes that there
should be more ERp's and REp's than IR's and RI's in the message, he deduces that UZc cannot
be REp, rejects the 1st hypothesis and takes the 2d. It later turns out, after the problem has been
solved, that the deduction was correct, although the assumption on which it was based (expectation
of more frequent appearance of REp and ERp) was, in fact, not true in this particular case. The
cryptanalyst can only hope that the number of times when his deductions are correct, even though
based upon assumptions which later turn out to be erroneous, will abundantly exceed the
number of times when his deductions are wrong, even though based upon assumptions which later
prove to be correct. If he is lucky, the making of an assumption which is really not true will
make no difference in the end and will not delay solution; but if he is specially favored with
luck, it may actually help him solve the message, as was the case in this particular example.

i. Another comment of a general nature may be made in connection with this specific
example. The student may ask what would have been the procedure in this case if the message
had not contained such a tell-tale repetition as the word BATTALION, which formed the point
of departure for the solution, or, as it is often said, permitted an "entering wedge" to be driven
into the message. The answer to his query is that if the word BATTALION had not been repeated,
there would probably have been some other repetition which would have permitted the same
sort of attack. If the student is looking for cut and dried, straightforward, unvarying methods
of attack, he should remember that cryptanalysis, while it may be considered a branch of
mathematics, is not a science which has many "general solutions" such as are found and expected in
mathematics proper. It is inherent in the very nature of cryptanalytics that, as a rule, only
general principles can be established; their practical application must take advantage of
peculiarities and particular situations which are noted in specific messages. This is especially true in
a text on the subject. The illustration of a general principle requires a specific example, and the
latter must of necessity manifest characteristics which make it different from any other example.
The word BATTALION was not purposely repeated in this example in order to make the
demonstration of solution easy; "it just happened that way." In another example, some other entering
wedge would have been found. The student can be expected to learn only the general principles
which will enable him to take advantage of the specific characteristics manifested in specific cases.
Here it is desired to illustrate the general principles of solving Playfair ciphers and to point out
the fact that entering wedges must and can be found. The specific nature of the entering wedge
varies with specific examples.
CONCLUDING REMARKS

                                                                             Paragraph
Special remarks concerning the initial classification of cryptograms              47
Ciphers employing characters other than letters or figures                        48
Concluding remarks concerning monoalphabetic substitution                         49
Analytical key for cryptanalysis                                                  50

47. Special remarks concerning the initial classification of cryptograms. - a. The student
should by this time have a good conception of the basic nature of monoalphabetic substitution and
of the many "changes" which may be rung upon this simple tune. The first step of all, naturally,
is to be able to classify a cryptogram properly and place it in either the transposition or the
substitution class. The tests for this classification have been given and as a rule the student
will encounter no difficulty in this respect.

b. There are, however, certain kinds of cryptograms whose class cannot be determined in the
usual manner, as outlined in Par. 13 of this text. First of all there is the type of code message
which employs bona-fide dictionary words as code groups.¹ Naturally, a frequency distribution
of such a message will approximate that for normal plain text. The appearance of the message,
however, gives clear indications of what is involved. The study of such cases will be taken up in
its proper place. At the moment it is only necessary to point out that these are code messages and
not cipher, and it is for this reason that in Pars. 12 and 13 the words "cipher" and "cipher
messages" are used, the word "cryptogram" being used only where technically correct.

c. Secondly, there come the unusual and borderline cases, including cryptograms whose
nature and type can not be ascertained from frequency distributions. Here, the cryptograms are
technically not ciphers but special forms of disguised secret writings which are rarely susceptible
of being classed as transposition or substitution. These include a large share of the cases wherein
the cryptographic messages are disguised and carried under an external, innocuous text which is
innocent and seemingly without cryptographic content, for instance, in a message wherein
specific letters are indicated in a way not open to suspicion under censorship, these letters being
intended to constitute the letters of the cryptographic message and the other letters constituting
"dummies." Obviously, no amount of frequency tabulations will avail a competent, expert
cryptanalyst in demonstrating or disclosing the presence of a cryptographic message, written and
secreted within the "open" message, which serves but as an envelop and disguise for its authentic
or real import. Certainly, such frequency tabulations can disclose the existence neither of sub-
stitution nor transposition in these cases, since both forms are absent. Another very popular
method that resembles the method mentioned above has for its basis a simple grille. The whole
words forming the secret text are inserted within perforations cut in the paper and the remaining
space filled carefully, using "nulls" and "dummies", making a seemingly innocuous, ordinary
message. There are other methods of this general type which can obviously neither be detected
nor cryptanalyzed, using the principles of frequency of recurrences and repetition. These can
not be further discussed herein, but at a subsequent date a special text may be written for their
handling.²

¹ See Sec. XV, Elementary Military Cryptography.

² The subparagraph which the student has just read (47c) contains a hidden cryptographic message. With
the hints given in Par. 35e let the student see if he can find it.

48. Ciphers employing characters other than letters or figures. - a. In view of the
foregoing remarks, when so-called symbol ciphers, that is, ciphers employing peculiar symbols,
signs of punctuation, diacritical marks, figures of "dancing men", and so on are encountered
in practical work nowadays, they are almost certain to be simple, monoalphabetic ciphers.
They are adequately described in romantic tales,¹ in popular books on cryptography, and in
the more common types of magazine articles. No further space need be given ciphers of this
type in this text, not only because of their simplicity but also because they are encountered
in military cryptography only in sporadic instances, principally in censorship activities. Even
in the latter cases, it is usually found that such ciphers are employed in "intimate" correspondence
for the exchange of sentiments that appear less decorous when set forth in plain language. They
are very seldom used by authentic enemy agents. When such a cipher is encountered nowadays
it may practically always be regarded as the work of the veriest tyro, when it is not that of a
"crank" or a mentally-deranged person.

b. The usual preliminary procedure in handling such cases, where the symbols may be
somewhat confusing to the mind because of their unfamiliar appearance to the eye, is to substitute
letters for them consistently throughout the message and then treat the resulting text as an
ordinary cryptogram composed of letters is treated. This procedure also facilitates the construction
of the necessary frequency distributions, which would be tedious to construct by using symbols.
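The consistent substitution of letters for symbols may be sketched as follows. This is a minimal illustration in Python, assuming that each symbol has already been transcribed as a distinct character or token and that there are no more than 26 different symbols; the function name relabel_symbols is hypothetical.

```python
import string


def relabel_symbols(symbols):
    """Replace each distinct symbol by a letter, in order of first appearance.

    Returns the relabeled text and the symbol-to-letter table.
    Assumes no more than 26 distinct symbols occur.
    """
    table = {}
    out = []
    for s in symbols:
        if s not in table:
            # The next unused letter stands for this newly met symbol.
            table[s] = string.ascii_uppercase[len(table)]
        out.append(table[s])
    return "".join(out), table


if __name__ == "__main__":
    text, table = relabel_symbols(list("#*#@*#"))
    print(text)   # relabeled cryptogram, ready for a frequency count
    print(table)  # which letter stands for which symbol
```

The letters assigned are arbitrary labels; only the consistency of the replacement matters, since the resulting text is then attacked as an ordinary monoalphabetic cryptogram.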

c. A final word must be said on the subject of symbol ciphers by way of caution. When
symbols are used to replace letters, syllables, and entire words, the systems approach code
methods in principle, and can become difficult of solution.² The logical extension of the use of
symbols in such a form of writing is the employment of arbitrary characters for a specially
developed "shorthand" system bearing little or no resemblance to well-known, and therefore
nonsecret, systems of shorthand, such as Gregg, Pitman, etc. Unless a considerable amount
of text is available for analysis, a privately-devised shorthand may be very difficult to solve.
Fortunately, such systems are rarely encountered in military cryptography. They fall under the
heading of cryptographic curiosities, of interest to the cryptanalyst in his leisure moments.³

d. In practical cryptography today, as has been stated above, the use of characters other
than the 26 letters of the English alphabet is comparatively rare. It is true that there are a
few governments which still adhere to systems yielding cryptograms in groups of figures. These
are almost in every case code systems and will be treated in their proper place. In some cases
cipher systems, or systems of enciphering code, are used which are basically mathematical in
character and operation, and therefore use numbers instead of letters. Some persons are
inclined toward the use of numbers rather than letters because numbers lend themselves much
more readily to certain arithmetical operations such as addition, subtraction, and so on, than
do letters.⁴ But there is usually added some final process whereby the figure groups are
converted into letter groups, for the sake of economy in transmission.

¹ The most famous: Poe's The Gold Bug; Arthur Conan Doyle's The Sign of Four.

² The use of symbols for abbreviation and speed in writing goes back to the days of antiquity. Cicero is
reported to have drawn up "a book like a dictionary, in which he placed before each word the notation (symbol)
which should represent it, and so great was the number of notations and words that whatever could be written in
Latin could be expressed in his notations."

³ An example is found in the famous Pepys Diary, which was written in shorthand, purely for his own eyes,
by Samuel Pepys (1633-1703). "He wrote it in Shelton's system of tachygraphy (1641), which he complicated
by using foreign languages or by varieties of his own invention whenever he had to record passages least fit to be
seen by his servants, or by 'all the world.'"

⁴ But this, of course, is because we are taught arithmetic by using numbers, based upon the decimal system
as a rule. By special training one could learn to perform the usual "arithmetical" operations using letters.
For example, using our English alphabet of 26 letters, where A=1, B=2, C=3, etc., it is obvious that A+B=C,
just as 1+2=3; (A+B)²=I, etc. This sort of cryptographic arithmetic could be learned by rote, just as
multiplication tables are learned.
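The letter arithmetic mentioned in the last footnote can be illustrated with a short sketch in Python, assuming the numbering A=1, B=2, ..., Z=26 given there. The helper names value and letter are illustrative, and results outside the range 1 to 26 are simply not handled.

```python
import string

LETTERS = string.ascii_uppercase


def value(ch):
    """Numerical value of a letter, with A=1, B=2, ..., Z=26."""
    return LETTERS.index(ch) + 1


def letter(n):
    """Letter having the numerical value n; valid only for 1 <= n <= 26."""
    if not 1 <= n <= 26:
        raise ValueError("result lies outside the 26-letter alphabet")
    return LETTERS[n - 1]


if __name__ == "__main__":
    print(letter(value("A") + value("B")))         # A + B
    print(letter((value("A") + value("B")) ** 2))  # (A + B) squared
```

With these definitions A+B gives C and (A+B)² gives I, in agreement with the footnote.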
e. The only notable exceptions to the statement contained in the first sentence of the
preceding subparagraph are those of Russian messages transmitted in the Russian Morse alphabet
and Japanese messages transmitted in the Kata Kana Morse alphabet. As regards Chinese,
which is not an alphabetical language and comprises some 40,000 ideographs, since the Morse
telegraph code comprises only some 40 combinations, telegrams in Chinese are usually prepared
by means of codes which permit of substituting arbitrarily-assigned code groups for the
characters. Usually the code groups consist of figures. One such code, known as the Official Chinese
Telegraph Code, has about 10,000 4-figure groups, beginning with 0001, and these are arranged
so that there are 100 characters on each page. Sometimes, for purposes of secrecy or economy,
these figure groups are enciphered and converted into letter groups.

49. Concluding remarks concerning monoalphabetic substitution. - a. The alert student will
have by this time gathered that monoalphabetic substitution ciphers of the simple
or fixed type are particularly easy to solve, once the underlying principles are thoroughly
understood. As in other arts, continued practice with examples leads to facility and skill in solution,
especially where the student concentrates his attention upon traffic all of the same general nature,
so that the type of text which he is continually encountering becomes familiar to him and its
peculiarities or characteristics of construction give clues for short cuts to solution. It is true
that a knowledge of the general phraseology of messages, the kind of words used, their sequences,
and so on, is of very great assistance in practical work in all fields of cryptanalysis. The student
is urged to note particularly these finer details in the course of his study.

b. Another thing which the student should be on the lookout for in simple monoalphabetic
substitution is the consecutive use of several different mixed cipher alphabets in a single long
message. Obviously, a single, composite frequency distribution for the whole message will not
show the characteristic crest and trough appearance of a simple monoalphabetic cipher, since a
given cipher letter will represent different plain-text letters in different parts of the message.
But if the cryptanalyst will carefully observe the distribution as it is being compiled, he will
note that at first it presents the characteristic crest and trough appearance of monoalphabeticity,
and that after a time it begins to lose this appearance. If possible he should be on the lookout
for some peculiarity of grouping of letters which serves as an indicator for the shift from one
cipher alphabet to the next. If he finds such an indicator he should begin a second distribution
from that point on, and proceed until another shift or indicator is encountered. By thus isolating
the different portions of the text, and restricting the frequency distributions to the separate
monoalphabets, the problem may be treated then as an ordinary simple monoalphabetic
substitution. Consideration of these remarks in connection with instances of this kind leads to the
comment that it is often more advisable for the cryptanalyst to compile his own data than to
have the latter prepared by clerks, especially when studying a system de novo. For observations
which will certainly escape an untrained clerk can be most useful and may indeed facilitate
solution. For example, in the case under consideration, if a clerk should merely hand the
uniliteral distribution to the cryptanalyst, the latter might be led astray; the appearance of the
composite distribution might convince him that the cryptogram is a good deal more complicated
than it really is.
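The flattening of a composite distribution can be shown numerically. The sketch below, in Python, is an illustration under stated assumptions and not a method given in the text: the "roughness" of a distribution is measured here by the sum of the squares of the relative frequencies, two simple shifted alphabets stand in for the different mixed alphabets, and the sample plain text is invented for the purpose.

```python
from collections import Counter


def roughness(text):
    """Sum of squared relative letter frequencies; larger means more crests and troughs."""
    letters = [c for c in text.upper() if c.isalpha()]
    counts = Counter(letters)
    n = len(letters)
    return sum((k / n) ** 2 for k in counts.values())


def shifted_alphabet(text, shift):
    """Encipher capital letters with a shifted standard alphabet (stand-in for a mixed one)."""
    return "".join(chr((ord(c) - 65 + shift) % 26 + 65) for c in text)


PLAIN = "THEENEMYWILLATTACKATDAWN"      # invented sample text
FIRST = shifted_alphabet(PLAIN, 5)      # first portion, first alphabet
SECOND = shifted_alphabet(PLAIN, 8)     # second portion, second alphabet
WHOLE = FIRST + SECOND                  # the composite message

if __name__ == "__main__":
    print(roughness(FIRST), roughness(SECOND), roughness(WHOLE))
```

Each portion taken separately is as rough as the plain text itself, while the composite distribution is noticeably flatter, which is the appearance that might lead the cryptanalyst astray if he is handed only the composite count.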

c. Monoalphabetic substitution with variants represents an extension of the basic principle,
with the intention of masking the characteristic frequencies resulting from a strict
monoalphabeticity, by means of which solutions are rather readily obtained. Some of the subterfuges
applied in the establishment of variant or multiple values are simple and more or less fail to
serve the purpose for which they are intended; others, on the contrary, may interpose serious
difficulties to a straightforward solution. But in no case may the problem be considered of more
than ordinary difficulty. Furthermore, it should be recognized that where these subterfuges
are really adequate to the purpose, the complications introduced are such that the practical
manipulation of the system becomes as difficult for the cryptographer as for the cryptanalyst.

d. As already mentioned, in monoalphabetic substitution with variants it is most common
to employ figures or groups of figures. The reason for this is that the use of numerical groups
seems more natural or easier to the uninitiated than does the use of varying combinations of
letters. Moreover, it is easy to draw up cipher alphabets in which some of the letters are
represented by single digits, others by pairs of digits. Thus, the decomposition of the cipher
text, which is an irregular intermixture of uniliteral and multiliteral equivalents, is made more
complicated and correspondingly difficult for the cryptanalyst, who does not know which
digits are to be used separately, which in pairs.
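The difficulty of decomposition may be made concrete with a short sketch in Python. It assumes a hypothetical system in which certain digits are always used singly and every other digit begins a pair; the particular digits chosen, and the function names, are illustrative only.

```python
def splittings(digits):
    """All ways of dividing a digit string into units of one or two digits."""
    if not digits:
        return [[]]
    # Take the first digit alone ...
    out = [[digits[0]] + rest for rest in splittings(digits[1:])]
    # ... or take the first two digits together as a pair.
    if len(digits) >= 2:
        out += [[digits[:2]] + rest for rest in splittings(digits[2:])]
    return out


def split_with_rule(digits, single_digits):
    """Divide the text when it is known which digits stand alone (hypothetical rule)."""
    units, i = [], 0
    while i < len(digits):
        step = 1 if digits[i] in single_digits else 2
        units.append(digits[i:i + step])
        i += step
    return units


if __name__ == "__main__":
    print(len(splittings("1234567890")))         # divisions open to the cryptanalyst
    print(split_with_rule("3714825", "12345"))   # division known to the correspondents
```

The legitimate correspondents, knowing the rule, divide the text in only one way, whereas the cryptanalyst, not knowing it, faces a number of possible divisions that grows rapidly with the length of the text.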

e. A few words may be added here in regard to a method which often suggests itself to
laymen. This consists in using a book possessed by all the correspondents and indicating the letters
of the message by means of numbers referring to specific letters in the book. One way consists
in selecting a certain page and then giving the line number and position of the letter in the line,
the page number being shown by a single initial indicator. Another way is to use the entire
book, giving the cipher equivalents in groups of three numbers representing page, line, and
number of letter. (Ex.: 75-8-10 means page 75, 8th line, 10th letter in the line.) Such systems
are, however, extremely cumbersome to use and, when the cryptographing is done carelessly,
can be solved. The basis for solution in such cases rests upon the use of adjacent letters on the
same line, the accidental repetitions of certain letters, and the occurrence of unenciphered words
in the messages, when laziness or fatigue intervenes in the cryptographing.¹
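The page-line-letter method may be sketched as follows in Python. The "book" here is an invented two-line page, letters are counted without regard to spaces, and the names BOOK and decipher are hypothetical.

```python
# A stand-in for the book held by the correspondents: page number -> list of lines.
BOOK = {
    75: [
        "IT WAS THE BEST OF TIMES",
        "IT WAS THE WORST OF TIMES",
    ],
}


def decipher(groups, book):
    """Turn groups such as '75-1-3' (page-line-letter) back into plain text."""
    out = []
    for group in groups:
        page, line, pos = (int(x) for x in group.split("-"))
        # Count letters only, ignoring spaces and punctuation in the line.
        letters = [c for c in book[page][line - 1] if c.isalpha()]
        out.append(letters[pos - 1])
    return "".join(out)


if __name__ == "__main__":
    print(decipher(["75-1-3", "75-1-8", "75-2-12", "75-1-2"], BOOK))
```

A careless encipherer tends to take successive letters from the same line or to reuse the same reference for a common letter, and it is precisely such habits that give the cryptanalyst the clues mentioned above.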

f. It may also be indicated that human nature and the fallibility of cipher clerks are such
that it is rather rare for an encipherer to make full use of the complement of variants placed
at his disposal. The result is that in most cases certain of the equivalents will be used so much
more often than others that diversities in frequencies will soon manifest themselves, affording
important data for attack by the cryptanalyst.

g. In the World War the cases where monoalphabetic substitution ciphers were employed
in actual operations on the Western Front were exceedingly rare because the majority of the
belligerents had a fair knowledge of cryptography. On the Eastern Front, however, the
extensive use, by the poorly prepared Russian Army, of monoalphabetic ciphers in the fall of 1914
was an important, if not the most important, factor in the success of the German operations
during the Battle of Tannenberg.² It seems that a somewhat more secure cipher system was
authorized, but proved too difficult for the untrained Russian cryptographic and radio personnel.
Consequently, recourse was had to simple substitution ciphers, somewhat interspersed with
plain text, and sometimes to messages completely in plain language. The damage which this
faulty use of cryptography did to the Russian Army and thus to the Allied cause is incalculable.

h. Many of the messages found by censors in letters sent by mail during the World War
were cases of monoalphabetic substitution, disguised in various ways.

¹ In 1916 the German Government conspired with a group of Hindu revolutionaries to stir up a rebellion in
India, the purpose being to cause the withdrawal of British troops from the Western Front. Hindu conspirators
in the United States were given money to purchase arms and ammunition and to transport them to India. For
communication with their superiors in Berlin the conspirators used, among others, the system described in this
paragraph. A 7-page typewritten letter, built up from page, line, and letter-number references to a book known
only to the communicants, was intercepted by the British and turned over to the United States Government
for use in connection with the prosecution of the Hindus for violating our neutrality. The author solved this
message without the book in question, by taking full advantage of the clues referred to.

² Gyldén, Yves. Chifferbyråernas insatser i världskriget till lands, Stockholm, 1931. A translation under
the title The Contribution of the Cryptographic Bureaus in the World War appeared in the Signal Corps Bulletin
in seven successive installments, from November-December 1933 to November-December 1934, inclusive.

Nikolaieff, A. M. Secret Causes of German success on the Eastern Front. Coast Artillery Journal,
September-October, 1936.
50. Analytical key for cryptanalysis. - a. It may be of assistance to indicate, by means of an
outline, the relationships existing among the various cryptographic systems thus far considered.
This graphic outline will be augmented from time to time as the different cipher systems are
examined, and will constitute what has already been alluded to in Par. 6d and there termed an
analytical key for cryptanalysis.¹ Fundamentally its nature is that of a schematic classification
of the different systems examined. The analytical key forms an insert at the end of the book.

b. Note, in the analytical key, the rather clear-cut, dichotomous method of treatment; that
is, classification by subdivision into pairs. For example, in the very first step there are only
two alternatives: the cryptogram is either (1) cipher, or (2) code. If it is cipher, it is either
(1) substitution, or (2) transposition. If it is a substitution cipher, it is either (1) monographic,
or (2) polygraphic, and so on. If the student will study the analytical key attentively, it will
assist him in fixing in mind the manner in which the various systems covered thus far are related
to one another, and this will be of benefit in clearing away some of the mental fog or haziness
from which he is at first apt to suffer.

c. The numbers in parentheses refer to specific paragraphs in this text, so that the student 
may readily turn to the text for detailed information or for purposes of refreshing his memory 
as to procedure. 

d. In addition to these reference numbers there have been affixed to the successive steps
in the dichotomy numbers that mark the "routes" on the cryptanalytic map (the analytical
key) which the student cryptanalyst should follow if he wishes to facilitate his travels along the
rather complicated and difficult road to success in cryptanalysis, in somewhat the same way in
which an intelligent motorist follows the routes indicated on a geographical map if he wishes to
facilitate his travels along unfamiliar roads. The analogy is only partially valid, however.
The motorist usually knows in advance the distant point which he desires to reach and he
proceeds thereto by the best and shortest route, which he finds by observing the route indications
on a map and following the route markers on the road. Occasionally he encounters detours,
but these are unexpected difficulties as a rule. Least of all does he anticipate any necessity for
journeys down what may soon turn out to be blind alleys and "dead-end" streets, forcing him
to double back on his way. Now the cryptanalyst also has a distant goal in mind, the solution
of the cryptogram at hand, but he does not know at the outset of his journey the exact spot
where it is located on the cryptanalytic map. The map contains many routes and he proceeds
to test them one by one, in a successive chain. He encounters many blind alleys and dead-end
streets, which force him to retrace his steps; he makes many detours and jumps many hurdles.
Some of these retracings of steps, doubling back on his tracks, jumping of hurdles, and detours
are unavoidable, but a few are avoidable. If properly employed, the analytical key will help
the careful student to avoid those which should and can be avoided; if it does that much it will
serve the principal purpose for which it is intended.

¹ This analytical key is quite analogous to the analytical keys usually found in the handbooks biologists
commonly employ in the classification and identification of living organisms. In fact, there are several points
of resemblance between, for example, that branch of biology called taxonomic botany and cryptanalysis. In
the former the first steps in the classificatory process are based upon observation of externally quite marked
differences; as the process continues, the observational details become finer and finer, involving more and more
difficulties as the work progresses. Towards the end of the work the botanical taxonomist may have to dissect
the specimen and study internal characteristics. The whole process is largely a matter of painstaking, accurate
observation of data and drawing proper conclusions therefrom. Except for the fact that the botanical taxonomist
depends almost entirely upon ocular observation of characteristics while the cryptanalyst in addition to
observation must use some statistics, the steps taken by the former are quite similar to those taken by the latter. It is
only at the very end of the work that a significant dissimilarity between the two sciences arises. If the botanist
makes a mistake in observation or deduction, he merely fails to identify the specimen correctly; he has an
"answer", but the answer is wrong. He may not be cognizant of the error; however, other more skillful botanists
will find him out. But if the cryptanalyst makes a mistake in observation or deduction, he fails to get any
"answer" at all; he needs nobody to tell him he has failed. Further, there is one additional important point of
difference. The botanist is studying a bit of Nature, and she does not consciously interpose obstacles, pitfalls,
and dissimulations in the path of those trying to solve her mysteries. The cryptanalyst, on the other hand, is
studying a piece of writing prepared with the express purpose of preventing its being read by any persons for
whom it is not intended. The obstacles, pitfalls, and dissimulations are here consciously interposed by the one
who cryptographed the message. These, of course, are what make cryptanalysis different and difficult.

e. The analytical key may, however, serve another purpose of a somewhat different nature.
When a multitude of cryptographic systems of diverse types must be filed in some systematic
manner apart from the names of the correspondents or other reference data, or if in conducting
instructional activities classificatory designations are desirable, the reference numbers on the
analytical key may be made to serve as "type numbers." Thus, instead of stating that a given
cryptogram is a keyword-systematically-mixed-uniliteral-monoalphabetic-monographic
substitution cipher one may say that it is a "type 901 cryptogram."

f. The method of assigning type numbers is quite simple. If the student will examine the
numbers he will note that successive levels in the dichotomy are designated by successive
hundreds. Thus, the first level, the classification into cipher and code, is assigned the numbers 101
and 102. On the second level, under cipher, the classification into monographic and polygraphic
systems is assigned the numbers 201 and 202, etc. Numbers in the same hundreds apply
therefore to systems at the same level in the classification. There is no particular virtue in this
scheme of assigning type numbers except that it provides for a considerable degree of expansion
in future studies.








APt^ENbll 

Table Ko. §aw 

1-A. Absolute frequencies of letters appearing in five sets of Government plain-text telegrams, each set 
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9^A. The 438 different digraphs of Table 6. Arranged first alphabetii^y according to their final letters 

and then according to their absolute frequencies 123 
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grams — 

-A. Arranged according to their absolute frequencies 129 

-B. Arranged first alphabetically according to their initial letters and then according to their absolute frequencies ... 130
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Bat NO. 1 


Table 1-A. Absolute frequencies of letters appearing in five sets of Government plain-text telegrams,
each set containing 10,000 letters, arranged alphabetically

Letter    Set No. 1    Set No. 2    Set No. 3    Set No. 4    Set No. 5

A              738          783          681          740          741
B              104          103           98           83           99
C              319          300          288          326          301
D              387          413          423          451          448
E            1,367        1,294        1,292        1,270        1,275
F              253          287          308          287          281
G              166          175          161          167          150
H              310          351          335          349          349
I              742          750          787          700          697
J               18           17           10           21           16
K               36           38           22           21           31
L              365          393          333          386          344
M              242          240          238          249          268
N              786          794          815          800          780
O              685          770          791          756          762
P              241          272          317          245          260
Q               40           22           45           38           30
R              760          745          762          735          786
S              658          583          585          628          604
T              936          879          894          958          928
U              270          233          312          247          238
V              163          173          142          133          155
W              166          163          136          133          182
X               43           50           44           53           41
Y              191          155          179          213          229
Z               14           17            2           11            5

Total       10,000       10,000       10,000       10,000       10,000
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Tabls 2-A .‘ — AbsolvUfregueneiea of loters appearing in the combined five sets of messages totalling 

50,000 letters, arranged alphabetically

A   3,683      G     819      L   1,821      Q     175      V     766
B     487      H   1,694      M   1,237      R   3,788      W     780
C   1,534      I   3,676      N   3,975      S   3,058      X     231
D   2,122      J      82      O   3,764      T   4,595      Y     967
E   6,498      K     148      P   1,335      U   1,300      Z      49
F   1,416
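
The combined table is the letter-by-letter sum of the five sets of 10,000 letters. The following Python sketch shows that arithmetic; the figures are those read from Tables 1-A and 1-B, and the names `SETS`, `combined_totals`, and `set_totals` are hypothetical, introduced here only for illustration.

```python
# Counts for Sets No. 1 to 5, as read from Tables 1-A and 1-B.
SETS = {
    "A": (738, 783, 681, 740, 741),       "B": (104, 103, 98, 83, 99),
    "C": (319, 300, 288, 326, 301),       "D": (387, 413, 423, 451, 448),
    "E": (1367, 1294, 1292, 1270, 1275),  "F": (253, 287, 308, 287, 281),
    "G": (166, 175, 161, 167, 150),       "H": (310, 351, 335, 349, 349),
    "I": (742, 750, 787, 700, 697),       "J": (18, 17, 10, 21, 16),
    "K": (36, 38, 22, 21, 31),            "L": (365, 393, 333, 386, 344),
    "M": (242, 240, 238, 249, 268),       "N": (786, 794, 815, 800, 780),
    "O": (685, 770, 791, 756, 762),       "P": (241, 272, 317, 245, 260),
    "Q": (40, 22, 45, 38, 30),            "R": (760, 745, 762, 735, 786),
    "S": (658, 583, 585, 628, 604),       "T": (936, 879, 894, 958, 928),
    "U": (270, 233, 312, 247, 238),       "V": (163, 173, 142, 133, 155),
    "W": (166, 163, 136, 133, 182),       "X": (43, 50, 44, 53, 41),
    "Y": (191, 155, 179, 213, 229),       "Z": (14, 17, 2, 11, 5),
}


def combined_totals(sets):
    """Sum the five per-set counts of each letter."""
    return {letter: sum(counts) for letter, counts in sets.items()}


def set_totals(sets):
    """Sum all letters within each set; every set should hold 10,000 letters."""
    number_of_sets = len(next(iter(sets.values())))
    return [sum(counts[i] for counts in sets.values()) for i in range(number_of_sets)]


if __name__ == "__main__":
    totals = combined_totals(SETS)
    print(set_totals(SETS))
    print(totals["E"], sum(totals.values()))
```

Checking that each set sums to 10,000 and the grand total to 50,000 is a convenient way to catch a misread figure in any one column.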



Table 1-B. — Absolute frequencies of letters appearing in five sets of Government plain-text telegrams,
each set containing 10,000 letters, arranged according to frequency

   Set No. 1        Set No. 2        Set No. 3        Set No. 4        Set No. 5
Letter  Freq.    Letter  Freq.    Letter  Freq.    Letter  Freq.    Letter  Freq.

  E     1,367      E     1,294      E     1,292      E     1,270      E     1,275
  T       936      T       879      T       894      T       958      T       928
  N       786      N       794      N       815      N       800      R       786
  R       760      A       783      O       791      O       756      N       780
  I       742      O       770      I       787      A       740      O       762
  A       738      I       750      R       762      R       735      A       741
  O       685      R       745      A       681      I       700      I       697
  S       658      S       583      S       585      S       628      S       604
  D       387      D       413      D       423      D       451      D       448
  L       365      L       393      H       335      L       386      H       349
  C       319      H       351      L       333      H       349      L       344
  H       310      C       300      P       317      C       326      C       301
  U       270      F       287      U       312      F       287      F       281
  F       253      P       272      F       308      M       249      M       268
  M       242      M       240      C       288      U       247      P       260
  P       241      U       233      M       238      P       245      U       238
  Y       191      G       175      Y       179      Y       213      Y       229
  G       166      V       173      G       161      G       167      W       182
  W       166      W       163      V       142      V       133      V       155
  V       163      Y       155      W       136      W       133      G       150
  B       104      B       103      B        98      B        83      B        99
  X        43      X        50      Q        45      X        53      X        41
  Q        40      K        38      X        44      Q        38      K        31
  K        36      Q        22      K        22      K        21      Q        30
  J        18      J        17      J        10      J        21      J        16
  Z        14      Z        17      Z         2      Z        11      Z         5

Total  10,000   Total   10,000   Total   10,000   Total   10,000   Total   10,000
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Table 1-C . — Absolute frequencies of vowels, high frequency consonants, medium frequency con- 
sonants, and low frequency consonants appearing in five sets of Government plainrtext tele- 
grams, each set containing 10,000 letters 



Set No.      Vowels      High Frequency      Medium Frequency      Low Frequency
                         Consonants          Consonants            Consonants

1             3,993           3,527               2,329                 151
2             3,985           3,414               2,457                 144
3             4,042           3,479               2,356                 123
4             3,926           3,572               2,358                 144
5             3,942           3,546               2,389                 123

Total ¹      19,888          17,538              11,889                 685

¹ Grand total, 50,000.



Table 2-B. — Absolute frequencies of letters appearing in the combined five sets of messages totalling
50,000 letters, arranged according to frequencies

E   6,498      I   3,676      C   1,534      Y     967      X     231
T   4,595      S   3,058      F   1,416      G     819      Q     175
N   3,975      D   2,122      P   1,335      W     780      K     148
R   3,788      L   1,821      U   1,300      V     766      J      82
O   3,764      H   1,694      M   1,237      B     487      Z      49
A   3,683



Table 2-C. — Absolute frequencies of vowels, high frequency consonants, medium frequency consonants,
and low frequency consonants appearing in the combined five sets of messages totalling
50,000 letters

Vowels ............................................................... 19,888

High Frequency Consonants (D, N, R, S, and T) ........................ 17,538

Medium Frequency Consonants (B, C, F, G, H, L, M, P, V, and W) ....... 11,889

Low Frequency Consonants (J, K, Q, X, and Z) .........................    685

Total ................................................................ 50,000
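
The four class totals can be reproduced from the combined letter counts by grouping the letters as the table does. The Python sketch below assumes the vowel class is A, E, I, O, U, and Y, which is the grouping the totals bear out; the names `COMBINED`, `CLASSES`, and `class_totals` are hypothetical.

```python
# Combined counts for the 50,000 letters, as given in Tables 2-A and 2-B.
COMBINED = {
    "A": 3683, "B": 487, "C": 1534, "D": 2122, "E": 6498, "F": 1416,
    "G": 819, "H": 1694, "I": 3676, "J": 82, "K": 148, "L": 1821,
    "M": 1237, "N": 3975, "O": 3764, "P": 1335, "Q": 175, "R": 3788,
    "S": 3058, "T": 4595, "U": 1300, "V": 766, "W": 780, "X": 231,
    "Y": 967, "Z": 49,
}

# Letter classes used in Table 2-C (Y is counted with the vowels).
CLASSES = {
    "vowels": "AEIOUY",
    "high frequency consonants": "DNRST",
    "medium frequency consonants": "BCFGHLMPVW",
    "low frequency consonants": "JKQXZ",
}


def class_totals(counts, classes=CLASSES):
    """Add up the counts of the letters belonging to each class."""
    return {name: sum(counts[letter] for letter in letters)
            for name, letters in classes.items()}


if __name__ == "__main__":
    for name, total in class_totals(COMBINED).items():
        print(f"{name:28s} {total:6,d}")
```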
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Tablb! 2-D. — Abadvte frequencies 0/ liters as mUicd liters of 10,000 words found in Qonemmeni 

plain-text telegrams

(1) ARRANGED ALPHABETICALLY

A     905      G     109      L     196      Q      30      V      77
B     287      H     272      M     384      R     611      W     320
C     664      I     344      N     441      S     965      X       4
D     525      J      44      O     646      T   1,253      Y      88
E     390      K      23      P     433      U     122      Z      12
F     855

Total 10,000

(2) ARRANGED ACCORDING TO ABSOLUTE FREQUENCIES

T   1,253      R     611      M     384      L     196      J      44
S     965      D     525      I     344      U     122      Q      30
A     905      N     441      W     320      G     109      K      23
F     855      P     433      B     287      Y      88      Z      12
C     664      E     390      H     272      V      77      X       4
O     646

Total 10,000

Table 2-E. — Absolute frequencies of letters as final letters of 10,000 words found in Government
plain-text telegrams

(1) ARRANGED ALPHABETICALLY

A     269      G     225      L     354      Q       8      V       4
B      22      H     450      M     154      R     769      W      45
C      86      I      22      N     872      S     962      X     116
D   1,002      J       6      O     575      T   1,007      Y     866
E   1,628      K      53      P     213      U      31      Z       9
F     252

Total 10,000

(2) ARRANGED ACCORDING TO ABSOLUTE FREQUENCIES

E   1,628      R     769      F     252      C      86      I      22
T   1,007      O     575      G     225      K      53      Z       9
D   1,002      H     450      P     213      W      45      Q       8
S     962      L     354      M     154      U      31      J       6
N     872      A     269      X     116      B      22      V       4
Y     866

Total 10,000
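
Counts of initial and final letters such as those in Tables 2-D and 2-E can be compiled mechanically from a list of words. The sketch below is a hypothetical illustration in Python, not the procedure used to prepare the tables; the function name `initial_and_final_counts` and the five sample words are invented for the example.

```python
from collections import Counter


def initial_and_final_counts(words):
    """Count how often each letter begins and ends a word."""
    initials = Counter()
    finals = Counter()
    for word in words:
        # Keep letters only, so punctuation does not count as a first or last letter.
        letters = [ch for ch in word.upper() if ch.isalpha()]
        if not letters:
            continue
        initials[letters[0]] += 1
        finals[letters[-1]] += 1
    return initials, finals


if __name__ == "__main__":
    sample = ["THE", "ENEMY", "ATTACKED", "AT", "DAWN"]  # invented sample text
    initials, finals = initial_and_final_counts(sample)
    print(initials.most_common())
    print(finals.most_common())
```

Applied to 10,000 words of text, the two counters would each total 10,000, as the tables do.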






Table 3. — Relative frequencies of letters appearing in 1,000 letters based upon Table 2-B

(1) ARRANGED ALPHABETICALLY

A    73.66      G    16.38      L    36.42      Q     3.50      V    15.32
B     9.74      H    33.88      M    24.74      R    75.76      W    15.60
C    30.68      I    73.52      N    79.50      S    61.16      X     4.62
D    42.44      J     1.64      O    75.28      T    91.90      Y    19.34
E   129.96      K     2.96      P    26.70      U    26.00      Z      .98
F    28.32

Total 1,000.00

(2) ARRANGED ACCORDING TO FREQUENCY

E   129.96      I    73.52      C    30.68      Y    19.34      X     4.62
T    91.90      S    61.16      F    28.32      G    16.38      Q     3.50
N    79.50      D    42.44      P    26.70      W    15.60      K     2.96
R    75.76      L    36.42      U    26.00      V    15.32      J     1.64
O    75.28      H    33.88      M    24.74      B     9.74      Z      .98
A    73.66

Total 1,000.00

(3) VOWELS

A    73.66
E   129.96
I    73.52
O    75.28
U    26.00
Y    19.34

Total 397.76

(4) HIGH-FREQUENCY CONSONANTS

D    42.44
N    79.50
R    75.76
S    61.16
T    91.90

Total 350.76

(5) MEDIUM-FREQUENCY CONSONANTS

B     9.74
C    30.68
F    28.32
G    16.38
H    33.88
L    36.42
M    24.74
P    26.70
V    15.32
W    15.60

Total 237.78

(6) LOW-FREQUENCY CONSONANTS

X     4.62
Q     3.50
K     2.96
J     1.64
Z      .98

Total 13.70

Total (3), (4), (5), (6) 1,000.00
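
Each relative frequency in Table 3 is the corresponding absolute frequency of Table 2-B reduced to a basis of 1,000 letters, that is, the count divided by 50. A short Python sketch of the conversion follows; the names `TOTAL_LETTERS` and `per_thousand` are hypothetical, and only a few letters are shown.

```python
TOTAL_LETTERS = 50_000  # size of the combined five sets of messages


def per_thousand(count, total=TOTAL_LETTERS):
    """Convert an absolute count into a frequency per 1,000 letters."""
    return round(count * 1000 / total, 2)


if __name__ == "__main__":
    # A few absolute frequencies taken from Table 2-B.
    for letter, count in (("E", 6498), ("T", 4595), ("N", 3975), ("Z", 49)):
        print(letter, per_thousand(count))
```

The same division applied to the class totals (for example 19,888 vowels) gives the class figures in parts (3) to (6) of the table.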
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Table 4. — Frequency diatriimHon for lOflOO lettere of literary Engliak, ae compiled /by Hitt} 

(1) ALPHABETICALLY ARRANGED

A     778      G     174      L     372      Q       8      V     112
B     141      H     595      M     288      R     651      W     176
C     296      I     667      N     686      S     622      X      27
D     402      J      51      O     807      T     855      Y     196
E   1,277      K      74      P     223      U     308      Z      17
F     197

(2) ARRANGED ACCORDING TO FREQUENCY

E   1,277      R     651      U     308      Y     196      K      74
T     855      S     622      C     296      W     176      J      51
O     807      H     595      M     288      G     174      X      27
A     778      D     402      P     223      B     141      Z      17
N     686      L     372      F     197      V     112      Q       8
I     667

Table 5. — Frequency distribution for 10,000 letters of telegraphic English, as compiled by Hitt *

(1) ALPHABETICALLY ARRANGED

A     813      G     201      L     392      Q      38      V     136
B     149      H     386      M     273      R     677      W     166
C     306      I     711      N     718      S     656      X      51
D     417      J      42      O     844      T     634      Y     208
E   1,319      K      88      P     243      U     321      Z       6
F     205

(2) ARRANGED ACCORDING TO FREQUENCY

E   1,319      S     656      U     321      F     205      K      88
O     844      T     634      C     306      G     201      X      51
A     813      D     417      M     273      W     166      J      42
N     718      L     392      P     243      B     149      Q      38
I     711      H     386      Y     208      V     136      Z       6
R     677

* Hitt, Capt. Parker. Manual for the Solution of Military Ciphers. Army Service Schools Press, Fort
Leavenworth, Kansas, 1916.
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Table 7-A. — The 4S8 d\ffere>ni digraphs of table 6 arranged according to their absolute frequencies 



EN 


111 


ER 


32 


OL 


19 


US 




12 


RE 


98 


RS 


31 


QT- 


19 


UT.-.-. 




12 


ER. 


87 


UR. 


31 


TS, 


19 


VI 




12 



NT 82 NI 30 WO 19 WA 12 

TH. 78 RI 30 BE. 18 FF. 11 



ON. 


77 


EL. 


29 


EP 


18 


PP. 


11 


IN 


75 


HT. 


28 


NO. 


18 


RR. 


11 


TE 


71 


LA 


28 


PR. 


18 


HE . . 


11 




64 


RO 


28 


AI 


17 


FT 


11 


OR 


64 


TA 


28 


HR 


17 


.RU 


11 


5?T 


63 






PO 


17 


YE 


11 


ED 


60 




*2, 496 


RD. 


17 


YS- 


11 


NE„ 


67 


T.T. 


27 


TR. 


17 


YO. 


10 


VE 


67 


AD 


27 


DO . . 


16 


FE. 


10 


ES. 


64 


DI 


27 


DT. 


15 


IF. 


10 


ND 


52 


El 


27 


IX. 


15 


LY. 


10 


TO 


60 


TR 


27 


QU. 


15 


MO 


10 


SE. 


49 


IT 


27 


SO. 


15 


SP- 


10 






NR 


27 


YT. . 


15 


YE. 


9 




‘1,249 


ME. 


26 


> 

0 

1 

1 

1 

1 

j 

1 

1 

1 

1 


14 


FR. 


9 


AT. 


47 


NA 


26 


AM. 


14 


IM. 


9 


TI 


46 


SH. 


26 


CH. 


14 


LD. 


9 


AR. 


44 


IV 


25 


CT. 


14 


MI.„. 


9 


EE 


42 


OE 


25 


EM. 


14 


NF. 


9 


RT 


42 


OM. 


25 


GE. 


14 


RC. 


9 


A«! 


41 


np 


25 


OS 


14 


RM _. 


9 


CO 


41 


NR 


24 


PA. 


14 


RY. 


9 


10 


41 


.RA 


24 


PL. 


13 


DD._. 


8 


TY 


41 


IT. 


23 


RP. 


13 


NN 


8 


FO 


40 


PE 


23 


SC 


13 


DP 


8 


FI 


39 


TR 


22 


WI 


13 


lA 


8 


RA. 


39 


RE 


22 


MM. 


13 


HU. 


8 


ET 


37 


UN. 


21 


DS 


13 


LT. 


8 


OU. 


37 


CA 


20 


AU. 


13 


MP. 


8 


T.E 


37 


EP 


20 


IE. 


13 


OR 


8 


M& 


36 


EV 


20 


LO 


13 


OR 


8 


TW 


36 


fM 


20 


- 




PT 


8 


EA. 


36 


HA 


20 


•3, 


746 


UG. 


8 


IS 


35 


HE. 


20 


AP 


12 


AV 


7 


SI 


34 


HR 


20 


DR 


12 


BY. 


7 


DE. 


33 


LI 


20 


EQ. 


12 


Cl 


7 


HI 


33 


SS __ 


19 


AY 


12 


EH 


7 


AL. 


32 


TT_ _ 


19 


EO 


12 


OA 


7 


CE. 


32 


IG. 


19 


OD 


12 


ER 


7 


DA 


32 


NC 


19 


SF... — 


12 


EX 


7 



¹ The 18 digraphs above this line compose 25% of the total.

² The 53 digraphs above this line compose 50% of the total.

³ The 117 digraphs above this line compose 75% of the total.
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Table 7-A.- 


— The 438 different digraphs of Table 6 arranged according to their absolute frequencies — Continued


Gk 


7 


sp___ 


6 


DV-_ _ 


8 


KT 


2 


IP __ 


7 


SR. 


6 


AA 


3 


LM 


2 


NU. 


7 


TL. 


5 


EU. 


3 


LR- 


2 


ov _ _ 


7 


TU. 


6 


OE 


3 


UI 


2 


RG 


7 


mt 


5 


YI. .. . . 


3 


T.V 


2 


RN- ... 


7 


AF. 


4 


FS- 


3 


LW- 


2 


TE. 


7 


BA 


4 


FU, 


3 


MR . 


2 


TN 


7 


BO 


4 


GN. 


3 


MT 


2 


XT 


7 


CK. 


4 


as 


3 


MU. 


2 


AB 


6 


CR. 


4 


HC 


3 


MY._ 


2 


AG 


6 


oy 


4 


HN. 


3 


NB 


2 


Rf. 


6 


DB 


4 


Lp 


3 


NK 


2 


no 


6 


DC 


4 


LC- 


3 


OG 


2 


YA 


6 


DN. 


4 


LF. 


3 


OK 


2 


GO 


6 


nw 


4 


LP 


3 


PF. 


2 


TP 


(T 


pp 


4 


MC-„ 


3 


RB 


2 


KE 


6 


EC 


4 


NP..... 


3 


.SG 


2 


TJ? 


6 


EY 


4 


NV. .. 


3' 


SL. 


2 


MR 


6 


GT 


4 


NW.. - 


3 


TP 


2 


PI.._ 


6 


R<? 


4 


OH 


3 


UP. 


2 


P$.„. 


6 


M.<? 


4 


AR . 


2 


WN 


2 


RP 


6 


NR 


4 


AK 


2 


XA 


2 


TC 


6 


NR 


4 


RT 


2 


XC ___ 


2 




6 


OB 


4 


BR. 


2 


XI .... 


2 


TM. 


6 


PM. . 


4 


BU. 


2 


XP- 


2 


m; 


6 


RW _ 


4 


DG 


2 


YB 


2 


VA 


® 


SN. 


4 


DR.L.. _ . 


2 


YL. 


2 


YN 


6 


SW- 


4 


DO 


2 


YR 


2 


GI. 


5 


ffH : 


4 


AO 


2 


ZE._ 


2 


DM. 


5 


YG 


4 


OY 


2 


GG 


1 


DP. 


s 


YD 


4 


PG 


2 


AJ 


1 


nri 


5 


YR 


4 


PL 


2 


BJ 


1 


01 


5 


PR 


3 


GC 


2 


BR.. 


1 


IWL 


5 


PU. 


3 


GF. 


2 


BS 


1 


UI. . .. 


5 


RH 


3 


GL. 


2 


BT._ 


1 


FA 


5 


SB 


3 


GP 


2 


CD. 


1 


GI 


5 


SM. 


3 


GU 


2 


CF. 


1 


GR 


5 


TB 


3 


HD 


2 


CR 


1 


HP 


5 


ItR 


3 


HM 


2 


CN 


1 


NT. . ■ 


5 


UG 


3 


IB... „ . . 


2 


CS 


1 


NM 


5 


im 


3 


IK. 


2 


cw. 


1 


NY 


5 


YP 


3 


IZ. 


2 


CY. 


1 


RL. 


5 


CC 


3 


JE. 


2 


DJ 


1 


RII 


5 


AV 


3 


.10 


2 


DY... .. 


1 


RV. 


5 


DL. 


3 


JU 


3 


EJ..._ 


1 
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Table 7-A. — The 1S8 different digraphs of taMe 6 arranged according to their absolute frequen- 
cies — Continued 



as: 


1 


HY _ 


1 


pn 


1 


WT. 


1 


no 


1 


JA. 


1 


PN- 


1 


WR- — 


1 


YU„ 


1 


KA .. 


1 


PV 


1 


WS. 


1 


R7. 


1 


KC _ 


1 


PW 


1 


RV 


1 


FD 


1 


KT. 


1 


py. 


1 


YD 


1 


FG- 


1 


KW 


1 


QH 


1 


YF 


1 


FIL 


1 


KS 


1 


OR. 


1 


XP. 


1 


FP 


1 


TI! 


1 


RJ„. 


1 


YH 


1 


FW 


1 


TM 


1 


RK 


1 


XN. 


1 


FY 


1 


TN 


1 


S3K 


1 


YO 


1 


CD 


1 


MD 


1 


sv. 


1 


YR 


1 


BJ 


1 


MF 


1 


5?Y 


1 


YS 


1 


BH 


. 1 


MH 


1 


TB 


1 


YB 


1 


BW 


1 


NJ._ 


1 


TO 


1 


YH 


1 


HR 


1 


NQ 


1 


T7. 


1 


YW 


1 


W. 


1 


fti 


1 


UF 


1 


7.A 


1 


HP 


1 


OY 


1 


IIV 


1 


7.T 


1 


HO 


1 


PB 


1 


VO 


1 






HW. 


1 


PC 


1 


VT- 


1 


Total 


. 6.000 



Table 7-B. — The 18 digraphs composing 25% of the digraphs in Table 6 arranged alphabetically
according to their initial letters

(1) AND ACCORDING TO THEIR FINAL LETTERS

AN  64
ED  60    EN 111    ER  87    ES  54
IN  75
ND  52    NE  57    NT  82
ON  77    OR  64
RE  98
SE  49    ST  63
TE  71    TH  78    TO  50
VE  57

Total 1,249

(2) AND ACCORDING TO THEIR ABSOLUTE FREQUENCIES

AN  64
EN 111    ER  87    ED  60    ES  54
IN  75
NT  82    NE  57    ND  52
ON  77    OR  64
RE  98
ST  63    SE  49
TH  78    TE  71    TO  50
VE  57

Total 1,249
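
The statement that these 18 digraphs compose 25% of the total can be checked by simple addition. The Python sketch below does so, taking the 18 frequencies from Table 7-B and the figure of 5,000 digraphs from the titles of Tables 7-C and 7-D; the names `TOP_DIGRAPHS`, `TOTAL_DIGRAPHS`, and `share_percent` are hypothetical.

```python
# The 18 most frequent digraphs and their absolute frequencies (Table 7-B).
TOP_DIGRAPHS = {
    "EN": 111, "RE": 98, "ER": 87, "NT": 82, "TH": 78, "ON": 77,
    "IN": 75, "TE": 71, "AN": 64, "OR": 64, "ST": 63, "ED": 60,
    "NE": 57, "VE": 57, "ES": 54, "ND": 52, "TO": 50, "SE": 49,
}

TOTAL_DIGRAPHS = 5_000  # number of digraphs tabulated in Table 6


def share_percent(counts, total=TOTAL_DIGRAPHS):
    """Percentage of all digraphs accounted for by the given counts."""
    return 100 * sum(counts.values()) / total


if __name__ == "__main__":
    print(sum(TOP_DIGRAPHS.values()), "digraphs,",
          round(share_percent(TOP_DIGRAPHS), 2), "percent of the total")
```

The sum is 1,249, or very nearly one quarter of the 5,000 digraphs, which agrees with the total shown in the table.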
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Tamb 7 ^. — The 5S digraphs etmpodng 60% ef the BfiOOdigraphs oj Table 8, atranged edpkahOieaUy 

according to their initial letters

(1) AND ACCORDING TO THEIR FINAL LETTERS

AL  32    AN  64    AR  44    AS  41    AT  47
CE  32    CO  41
DA  32    DE  33
EA  35    EC  32    ED  60    EE  42    EL  29    EN 111    ER  87    ES  54    ET  37
FI  39    FO  40
HI  33    HT  28
IN  75    IO  41    IS  35
LA  28    LE  37
MA  36
ND  52    NE  57    NI  30    NT  82
ON  77    OR  64    OU  37
RA  39    RE  98    RI  30    RO  28    RS  31    RT  42
SE  49    SI  34    ST  63
TA  28    TE  71    TH  78    TI  45    TO  50    TW  36    TY  41
UR  31
VE  57

Total 2,495

(2) AND ACCORDING TO THEIR ABSOLUTE FREQUENCIES

AN  64    AT  47    AR  44    AS  41    AL  32
CO  41    CE  32
DE  33    DA  32
EN 111    ER  87    ED  60    ES  54    EE  42    ET  37    EA  35    EC  32    EL  29
FO  40    FI  39
HI  33    HT  28
IN  75    IO  41    IS  35
LE  37    LA  28
MA  36
NT  82    NE  57    ND  52    NI  30
ON  77    OR  64    OU  37
RE  98    RT  42    RA  39    RS  31    RI  30    RO  28
ST  63    SE  49    SI  34
TH  78    TE  71    TO  50    TI  45    TY  41    TW  36    TA  28
UR  31
VE  57

Total 2,495
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Tablx 7-D . — The 117 digraphs eornpoaing 75% of ihe 6,000 digraphs of Table 6, arranged alphas, • 

betically according to thew ini^ letters — — 



(1) AND ACCORDING TO THEIR FINAL LETTERS

AC  14    AD  27    AI  17    AL  32    AM  14    AN  64    AR  44    AS  41    AT  47    AU  13
BE  18
CA  20    CE  32    CH  14    CO  41    CT  14
DA  32    DE  33    DI  27    DO  16    DS  13    DT  15
EA  35    EC  32    ED  60    EE  42    EF  18    EI  27    EL  29    EM  14    EN 111    EP  20
ER  87    ES  54    ET  37    EV  20
FI  39    FO  40
GE  14    GH  20
HA  20    HE  20    HI  33    HO  20    HR  17    HT  28
IC  22    IE  13    IG  19    IL  23    IN  75    IO  41    IR  27    IS  35    IT  27    IV  25
IX  15
LA  28    LE  37    LI  20    LL  27    LO  13
MA  36    ME  26
NA  26    NC  19    ND  52    NE  57    NG  27    NI  30    NO  18    NS  24    NT  82
OF  25    OL  19    OM  25    ON  77    OP  25    OR  64    OS  14    OT  19    OU  37
PA  14    PE  23    PO  17    PR  18
QU  15
RA  39    RD  17    RE  98    RI  30    RO  28    RS  31    RT  42
SA  24    SE  49    SH  26    SI  34    SO  15    SS  19    ST  63
TA  28    TE  71    TH  78    TI  45    TO  50    TR  17    TS  19    TT  19    TW  36    TY  41
UN  21    UR  31
VE  57
WE  22    WO  19
YT  15
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Tabls 7-D, Concluded . — The 117 digraphs comprising 76% of the 6,000 digraphs of Table 6, 
arranged cdphabetieally according to their initial liters 



(2) AND ACCORDING TO THEIR ABSOLUTE FREQUENCIES

AN  64    AT  47    AR  44    AS  41    AL  32    AD  27    AI  17    AC  14    AM  14    AU  13
BE  18
CO  41    CE  32    CA  20    CH  14    CT  14
DE  33    DA  32    DI  27    DO  16    DT  15    DS  13
EN 111    ER  87    ED  60    ES  54    EE  42    ET  37    EA  35    EC  32    EL  29    EI  27
EP  20    EV  20    EF  18    EM  14
FO  40    FI  39
GH  20    GE  14
HI  33    HT  28    HA  20    HE  20    HO  20    HR  17
IN  75    IO  41    IS  35    IR  27    IT  27    IV  25    IL  23    IC  22    IG  19    IX  15
IE  13
LE  37    LA  28    LL  27    LI  20    LO  13
MA  36    ME  26
NT  82    NE  57    ND  52    NI  30    NG  27    NA  26    NS  24    NC  19    NO  18
ON  77    OR  64    OU  37    OF  25    OM  25    OP  25    OL  19    OT  19    OS  14
PE  23    PR  18    PO  17    PA  14
QU  15
RE  98    RT  42    RA  39    RS  31    RI  30    RO  28    RD  17
ST  63    SE  49    SI  34    SH  26    SA  24    SS  19    SO  15
TH  78    TE  71    TO  50    TI  45    TY  41    TW  36    TA  28    TS  19    TT  19    TR  17
UR  31    UN  21
VE  57
WE  22    WO  19
YT  15

Total 3,745



Table 7-E. — All the 438 digraphs of Table 6, arranged first alphabetically according to their initial
letters and then alphabetically according to their final letters.

(SEE TABLE 6.— READ ACROSS THE ROWS) 
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Tablb 8. — The J}38 different digraphs of Table 6, arranged firet alphaAetieatty according to their 
initial letters, and then according to their absolute frequencies under each initial letter * 



AN. 


64 


AT. 


47 


AR. 


44 


AS 


41 


AL. 


32 


AD. 


27 


AI. 


17 


AC. 


14 


A1L.._ ... . 


14 


AU._ 


13 


AP. 


12 


AY. 


12 


AV 


7 


AB„ .. . 


6 


AG- 


6 


AF. 


4 


AA. 


3 


AW. 


3 


AH. 


2 


AK.__ .. . 


3 


AO 


2 


AE. 


1 


AJ 


1 


BE. 


18 


BY._ 


7 


BL. 


6 


BAl 


4 


BO 


4 


BI. . 


2 


BR. 


2 


BU. 


2 


BJ._ 


1 


BH 


1 


BS 


1 


BT. 


1 


CO...... 


41 


CE. 


32 


CA. 


20 


CH. 


14 



CT 


14 


Cl. ... 


7 


CL .._ 


5 


CK. 


4 


CR. . 


4 


CU. .. 


4 


CC- 


3 


CD. 


1 


CF . 


1 


CIL . 


1 


CN. 


1 


CS. 


1 


cw... . 


1 


CY. 


1 


DE. 


33 


DA. 


32 


DI 


27 


DO. 


16 


DT. 


15 


DS. 


13 


DR. 


12 


DD. 


8 


DF. 


8 


OIL 


5 


DP- 


5 


DU. 


6 


DB. 


4 


DC. 


4 


DN. 


4 


DW. 


4 


DL. 


3 


DV. 


3 


DG. 


2 


DH._ 


2 


DQ. 


2 


DJ. 


1 


DY. 


1 


EN 


111 


ER™ 


87 



ED. ... 


60 




54 


EE. 


42 


ET __ _ 


37 


EA. 


35 


EC. 


32 


EL. 


29 


El 


27 


EP. 


20 


EV. 


20 


EF 


18 


EIL 


14 


EO. 


12 


EQ. 


12 


Eli 


7 


EW. 


7 


EX. 


7 


EB 


4 


EG. 


4 


EY. 


4 


EU 


3 


E.T 


1 


EZ. 


1 


FO. 


40 


FI._ 


39 


PF 


11 


FT. 


11 


PE. . 


10 


FR. 


9 


FAL 


5 


FS. 


3 


FU. 


3 


FC. 


2 


FL 


2 


FD. 


1 


PG. 


1 


Fli 


1 


FP. 


1 


FW 


1 


PY. 


1 



Cai 20 

GE. 14 

GA 7 

GO. 6 

GI 5 

GR. 6 

GT. 4 

GM. 3 

GS. ..... 3 

GC. ..... 2 

GF. 2 

GL 2 

GP. 2 

GU. 2 

GD. 1 

GGl 1 

GJ 1 

GIL 1 

GWl.. 1 



Hi. 33 

HI 23 

HiL 20 

HE. _... 20 

HD- 20 

HR. 17 

HU. 8 

HP. .. 5 

HE. 4 

HC. 3 

HM: 3 

HD. 2 

HIL - 2 

HB. ... 1 

HL. 1 

HR :.. i 

HQ. 1 

HW. 1 

HR .... 1 



* For arrangement alphabetically first under initial letters and then under final letters, see Table 6.
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Tablb 8, Contd . — The 438 different digraphs qf TaHe 6, arranged Jiret alphabetiocdly aceording to 
their initial letters, and then according to their absolute frequencies under each initial letter *



IN. 


76 


LI 


20 


10. 


41 


LO 


13 


IS 


36 


LY. 


10 


IR. _. 


27 


LD. 


9 


IT . 


27 


LT 


8 


IV 


26 


LS 


6 


IL. 


23 


LB. 


3 


IC. .... 


22 


LC. 


3 


IG. 


19 


LF. 


3 


IX 


15 


LP 


3 


IE 


13 


Ul 


2 


IF. 


10 


LR 


2 


lU 


9 


LU __ 


2 


TA 


8 


LV. 


2 


IP. 


7 


LW. 


2 


ID.__ 


6 


LG. 


1 






LH..^ 


1 


IB._.. __ 


2 


LN... 


1 


IK. 


2 






IZ. 


2 


MA . 


36 






ME 


26 


JE. 


2 


MM 


13 


JO 


2 


MO 


10 


JU 


2 


MT 


9 


JA. 


1 


HP 


8 






MR 


6 


KE. 


6 


MS _ _ _ 


4 


KI 


2 


HC. . 


8 


KA 


1 


MR 


2 


KC 


1 


MT._ 


2 


KL.__ 


1 


MU. 


2 


KN_ . >. 


1 


MY. 


2 


KS.._ 


1 


MO 


1 






MF. 


1 


LE. 


37 


MH. 


1 


LA. 


28 






LL. 


27 


NT. 


82 



NE. 67 OA 7 

ND. 62 OV 7 

NI 30 00 6 

NG 27 01 6 

NA. 26 OB 4 

NS. 24 OE. 8 

NO 19 OH. 3 

NO 18 OG. 2 

NF 9 OK. 2 

NN. 8 OY. 2 

NU 7 OJ 1 

NL. 6 OX. 1 

NU 6 

NY 6 PE. 23 

NH. 4 PR. 18 

NR. 4 PO. 17 

Iff. 8 PA. 14 

NV_^ 3 PL 13 

NN. 3 PP. 11 

NB. 2 PT. 8 

NK. 2 PI 6 

NJ 1 PS 6 

NQ. 1 PM. 4 

PH. 3 

ON 77 PU. 3 

OR. 64 PF. 2 

OU. 37 PB. 1 

OF 26 PC. 1 

OM. 26 PD. 1 

OP. 26 PN. 1 

OL 19 PV. 1 

OT 19 PW. 1 

OS. 14 PY. 1 

OD. 12 QU. 16 

OC. 8 OH. 1 

ON 8 OR. 1 




* For arrangement alphabetically first under initial letters and then under final letters, see Table 6.
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Table 8, Concluded . — The 4S8 different digraphe off Table 6, arranged ffirst alphabeticaUy according 
to their initial letters, and then according to their absolute frequencies under each initial letter *



RE... 


»8 


SR 


5 


US 


12 


XT 


2 


RT... 


42 


SN 


4 


UT 


12 


XP. 


2 


RA... 


»() 


SW 


4 


UE 


11 


xn 


1 


RS... 


SI 


SB 


3 


UG 


8 


XE 


1 


RI... 


30 


SM 


3 


UL 


6 


XP 


1 


RO... 


28 


SG. 


2 


UA 


5 


XH 


1 


RD. 


17 


SL 


2 


UI 


5 


XN 


1 


RP... 


13 


SK 


1 


UH 


5 


XO 


1 


RR... 


11 


SV 


1 


UB 


3 


XR 


1 


RC... 


0 


SY. .. .. 


1 


UC 


3 


XS 


1 


RM... 


9 






UD. 


3 






RY... 


. 0 


TH. 


78 


UP 


2 


YT. 


15 


RG... 


7 


TE 


71 


UP 


1 


YP 


11 


RN... 


7 


TO 


50 


UO 


1 


YS 


11 


RF... 


A 


TI. .... 


45 


UV 


1 


YO 


10 


RL... 


6 


TY 


41 






YE. . 


9 


RU. 


6 


TW. 


36 


VE 


67 


YA 


6 


RV._ 


5 


TA 


28 


VI 


12 


YN. 


6 


RW... 


4 


TS 


10 


VA 


6 


YC 


4 


RH... 


3 


TT 


19 


VO 


1 


YD 


4 


RB... 


2 


TR 


17 


VT 


1 


YR. 


4 


RJ... 


1 


TF 


7 






YI 


3 


RK... 


1 


TN. 


7 


RE 


22 


YP 


3 






TC _ . 


6 


WO 


19 


YB 


2 


ST... 


63 


TO 


6 


WI 


13 


YL 


• 2 


SE... 


49 


TIL 


6 


WA 


12 


Yll 


2 


SI... 


34 


TL 


5 


WH 


4 


YR 


1 


SH... 


26 


TU... . . 


5 


WN 


2 


YH. 


1 


SA... 


24 


TB 


3 


WL 


1 


YU 


1 


SS... 


19 


TP 


2 


WR 


1 


VW 


1 


SO... 


16 


TG. 


1 


WS 


1 






SC... 


13 


Tq 


1 


WY 


1 


7.E 


2 


SF... 


12 


TZ 


1 






ZA. .. 


1 


SU... 


11 






XT 


7 


7T 


1 


SP... 


10 


UR. 


31 


XA. 


2 






SD... 


5 


UN. 


21 


XC 


2 


Total 


5, 000 



* For arrangement alphabetically first under initial letters and then under final letters, see Table 6. 
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Tablb 9-A . — The different digraphs of ToMe 6, arranged Jirtt tdphahetieaUy according to their 

final Utters, and then according to their absolute fregyeneies 



WL 39 

MA 36 

EA 36 

DA ..... 32 

LA .... 28 

TA 28 

NA 26 

SA. 24 

CA ....... 20 

HA. ....... 20 

PA. ....... 14 

WA. _..... 12 

lA ....... 8 

GA. ...... 7 

OA- 7 

VA. ....... 6 

YA. 0 

FA- 6 

VA. 6 

BA. 4 

AA. 8 

XA. ... 2 

JA. ....... 1 

KA. ..... 1 

ZA 1 

AB ....... 6 

MB. ....... 6 

DB. ..... - 4 

EB ....... 4 

DB. ..... 4 

LB ....... 3 

SB. -3 

TB ...... 3 

VB .. 3 

IB ...... 2 

NB ..... 2 

RB 2 

YB ....... 2 

HB i 



EC 32 

IC 22 

NO 19 

AC 14 

SC. 13 

RC 9 

OC. 8 

TC 6 

DC. 4 

YC ...... 4 

CC. ....... 3 

HC. ... 3 

LC. .... 3 

MC ....... 3 

UC. ..... 3 

FC ... 2 

GC. 2 

XC. 2 

..... 1 

PC :. 1 

ED. 00 

KD. 52 

AD. ... 27 

RD. ..... .17 

OD. ...... 12 

LD. .... 9 

PD. 8 

ID. 6 

,!PD. ....... 6 

SD. 5 

YD. 4 

UD. ... 3 

HD. ... 2 

CD. 1 

FD. 1 

GD. ... i 

m. --.... i 

PD. i 



RE. 98 

TE. 71 

HE. 67 

VE. 67 

SE. 49 

EE- 42 

LE. 37 

DE. 33 

CE. 32 

ME. ... 26 

PE. 23 

WE. ... 22 

HEL 20 

BE. 18 

GE. 14 

IE. 13 

UE. . 11 

FE. 10 

YE. 9 

KE- 6 

OE. ;.... 3 

JE. 2 

ZE. .... 2 

AE. 1 

XE. 1 



OF 26 

EF. 18 

SF. 12 

FF. 11 

YF. 11 

IF.__. 10 

NF. 9 

DF 8 

TF. 7 

RF 6 

HF. 5 

AF. 4 



GF. 2 

PF. .-. 1 

CF. 2 

MF. 1 

UF. 1 

XF 1 

NG. 27 

IG. 19 

UG. . 8 

RG. 7 

AG. 6 

EG. 4 

DG. 2 

OG. 2 

SG. 2 

FG. 1 

GG. 1 

LG. 1 

TG. 1 

YG- 1 



TH. 78 

SH. 26 

GH. ... 20 

CH. ..... 14 

EHL 7 

NH. 4 

WH. 4 

OH. 3 

PH. 3 

RH. 3 

AH. 2 

DHL 2 

LH. ...... 1 

MH. 1 

XH. 1 
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Table 9-A, Contd . — The 4S8 different digraphs of Table 6, arranged find alphaheHealiy according 
to their final letters, and then according to their absolute frequencies



TI 


45 


PI 


39 


SI 


34 


HI 


33 


NI 


30 


RI 


30 


DI 


27 


El...... 


27 


LI 


20 


AI 


17 


WI 


13 


VI 


12 


MI 


9 


Cl 


7 


PI 


6 


GI 


5 


01 


6 


UI 


5 


yi 


3 


BI 


2 


KI 


2 


XI 


2 


ZI 


1 


AJ 


1 


BJ 


1 


DJ 


1 


EJ 


1 


GJ 


1 


NJ. 


1 




1 


RJ. 


1 


CK 


4 


AK 


2 


IK 


2 


NK. __ 


2 


OK 


2 


RK. 


1 


SK. 


1 


AL 


32 


EL 


29 



T.I. 


27 


IL. . 


23 


OL. 


19 


PL. 


13 


BL 


6 


UL 


6 


CL 


5 


NL 


5 


RL.... . 


5 


TL 


5 


DL... 


3 


PL 


2 


GL. 


2 


SL. 


2 


YL. 


2 


HL. 


1 


KL. 


1 


WL. 


1 


0M.__ 


25 


AM. 


14 


EM. . 


14 


MM. 


13 


IM. 


9 


RM. .. . . 


9 


TM. 


6 


DM. _ 


5 


NM. 


6 


UM. 


5 


PM. .... 


4 


SiL 


3 


HM_ 


2 


LM 


2 


YM... 


2 


BM. 


1 


CM. 


1 


PM. 


1 


GM.. - 


1 


QM... - 


1 


EN 


111 


ONL 


77 


IN 


76 



AN. 


64 


UN. 


21 


NN. 


8 


RN. 


7 


TN. 


7 


YN. 


6 


DN. 


4 


SN. 


4 


GN. 


3 


HN. 


3 


WN. . 


2 


CN. 


1 


KN. 


1 


LN. 


1 


PN._ 


1 


XN..._ 


1 


TO 


50 


CO 


41 


10. 


41 


P0._ 


40 


RO 


28 


HO... 


20 


WO. 


19 


NO 


18 


PO 


17 


DO 


16 


SO. 


15 


LO 


13 


RO 


12 


MO. . 


10 


YO. 


10 


GO... .. 


6 


00. 


6 


BO... 


4 


AO 


2 


JO. 


2 


uo 


1 


VO 


1 


xo. 


1 


OP. 


26 


EP 


20 



RP 


13 


AP. 


12 


PP. 


11 


SP 


10 


IIP 


8 


IP 


7 


DP. 


6 


LP. _. 


3 


NP 


3 


YP 


3 


GP. 


2 


TP... 


2 


UP 


2 


XP____ _ _ 


2 


FP. 


1 


HP _ 


1 


EQ 


12 


DO. _ 


2 


HQ 


1 


HQ 


1 


Ti 


1 


ITR 


87 


OR 


64 


AR.. . . 


44 


UR_ 


31 


IR. 


27 


PR. 


18 


HR 


17 


TR._ 


17 





12 


RR 


11 


PR - 


9 


(2L 


5 


SR . 


5 


CR- 


4 


NR... 


4 


YR _ 


4 


BR._ 


2 


IR 


2 


piR 


2 


QR 


1 


WR.. 


1 


XR. 


1 
Table 9-A, Concluded. — The 428 different digraphs of Table 6, arranged first alphabetically
according to their final letters and then according to their absolute frequencies



RS 


54 


(re 


19 


Jll 


2 


' ;‘i% 


1 


AS 


41 


TT 


19 


LU 


2 


YW 


1 


IS 


56 


DT. 


16 


Mil 


2 




1 T 


RS 


5l 


VT 


16 


¥ll 


1 


J3f 


'1?6 


NS 


24 


CT-- 


14 






EX. 


7 


ss 


id 


UT 


12 


IV. 


26 




[ii 


TS 


19 


FT ___ 


11 


EV. 


20 




r 


OS 


14 


LT 


8 


AV 


7 






rts 


15 


PT 


8 


OV. 


7 


AY 


18 


IIS 


12 


XT 


7 


RV 


5 


fjir 


ID 


YS 


11 


GT 


4 


DV: ..... 


8 


RY 


9 


T «5 


6 


MT 


2 


NV- 


3 


BY 


7 


PS 


6 


s*r 


1 


liV 


2 


NV 


6 


MS 


4 


VT 


1 


PV 


1 


EY 


4 


MS 


4 






SV 


1 


MY 


2 


*« 


3 


OtI 


87 


tiv 


1 


bVi 


2 


fiS 


3 


QU. 


16 






GY 


1 


RS 


1 


Art 


13 


Tff _____ 


36 


BY 


1 


OS 


1 


Sir 


11 


m 


"'■t 


FY 


1 


KS 


1 


HH 


8 


EW. 


7 


HY 


1 


WS 


1 


NU, 


7 


DW 


4 


PY 


1 


Vs 


1 


rill 


6 


RW 


4 


SY 


1 






mi 


s 


Sff 


4 


WV 


1 


NT 


82 


Tll 


5 


AW 


3 






ST. 


68 


cu. 


4 


NW- 


3 


±2. 


2 


AT- 


47 


Su. 


3 


LW. 


2 


E!2- 


,1 


RT 


42 


Pii 


8 


cw. 


1 




1 


RT 


37 


Wl 


3 


w 


1 






MT. 


28 


Btl 


2 


GW 


1 


Total 


6, 6&b 


IT- 


27 


OtJ. 


2 


HW. 


1 
Table 9-B. — The 18 digraphs composing 25% of the 5,000 digraphs of Table 6, arranged alpha-
betically according to their final letters

(1) AND ACCORDING TO THEIR INITIAL LETTERS

ED  60    ND  52
NE  57    RE  98    SE  49    TE  71    VE  57
TH  78
AN  64    EN 111    IN  75    ON  77
TO  50
ER  87    OR  64
ES  54
NT  82    ST  63

Total 1,249

(2) AND ACCORDING TO THEIR ABSOLUTE FREQUENCIES

ED  60    ND  52
RE  98    TE  71    NE  57    VE  57    SE  49
TH  78
EN 111    ON  77    IN  75    AN  64
TO  50
ER  87    OR  64
ES  54
NT  82    ST  63

Total 1,249
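
The total and the percentage quoted for Table 9-B can be checked by simple addition. The short sketch below does so in Python; the frequencies entered are those which agree with the printed total of 1,249, and the variable names are merely illustrative.

```python
# Frequencies of the 18 digraphs of Table 9-B (values consistent with the
# printed total of 1,249; names below are illustrative only).
table_9b = {
    "ED": 60, "ND": 52, "NE": 57, "RE": 98, "SE": 49, "TE": 71, "VE": 57,
    "TH": 78, "AN": 64, "EN": 111, "IN": 75, "ON": 77, "TO": 50,
    "ER": 87, "OR": 64, "ES": 54, "NT": 82, "ST": 63,
}

total = sum(table_9b.values())
share = total / 5000  # Table 6 is based on 5,000 digraphs

print(len(table_9b), "digraphs,", total, "occurrences,", f"{share:.1%} of 5,000")
```

Eighteen digraphs out of a possible 676 thus account for about one quarter of all digraph occurrences in the text counted.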



Table 9-C. — The 53 digraphs composing 50% of the 5,000 digraphs of Table 6, arranged
alphabetically according to their final letters —

(1) AND ACCORDING TO THEIR INITIAL LETTERS

DA  32    EA  35    LA  28    MA  36    RA  39    TA  28
EC  32
ED  60    ND  52
CE  32    DE  33    EE  42    LE  37    NE  57    RE  98    SE  49    TE  71    VE  57
TH  78
FI  39    HI  33    NI  30    RI  30    SI  34    TI  45
AL  32    EL  29
AN  64    EN 111    IN  75    ON  77
CO  41    FO  40    IO  41    RO  28    TO  50
AR  44    ER  87    OR  64    UR  31
AS  41    ES  54    IS  35    RS  31
AT  47    ET  37    HT  28    NT  82    RT  42    ST  63
OU  37
TW  36
TY  41

Total 2,495
Table 9-C, Concluded. — The 53 digraphs composing 50% of the 5,000 digraphs of Table 6,
arranged alphabetically according to their final letters —

(2) AND ACCORDING TO THEIR ABSOLUTE FREQUENCIES

RA  39    MA  36    EA  35    DA  32    LA  28    TA  28
EC  32
ED  60    ND  52
RE  98    TE  71    NE  57    VE  57    SE  49    EE  42    LE  37    DE  33    CE  32
TH  78
TI  45    FI  39    SI  34    HI  33    NI  30    RI  30
AL  32    EL  29
EN 111    ON  77    IN  75    AN  64
TO  50    CO  41    IO  41    FO  40    RO  28
ER  87    OR  64    AR  44    UR  31
ES  54    AS  41    IS  35    RS  31
NT  82    ST  63    AT  47    RT  42    ET  37    HT  28
OU  37
TW  36
TY  41

Total 2,495


Table 9-D. — The 117 digraphs composing 75% of the 5,000 digraphs of Table 6, arranged
alphabetically according to their final letters —

(1) AND ACCORDING TO THEIR INITIAL LETTERS






CA. 


20 


ND. 


52 


EF 


18 


SI 


34 


DA. 


32 


RD-. 


17 


OF..... 


25 


TI 


45 


EA. 


35 














HA. .. 


20 


BE. 


18 


IG. 


19 


AT. 


32 


LA. 


28 


CEl 


32 


NG. 


27 


ET. 


29 


UA. 


36 


DE. 


33 






IL. 


23 


NJL 


26 


EE. 


42 


CH. 


14 


T.T. 


27 


PA 


14 


GE. 


14 


GH..... .. 


20 


OT. 


19 


RA 


39 


HE. 


20 


SH. 


26 






SA 


24 


IE. 


13 


TH. 


78 






TA 


28 


LE. 


37 






A9 ...... 


14 






ME. 


26 


Al 


17 


EM. 


14 


AC._ 


14 


NE. 


57 


DI_ 


27 


OH 


25 


EC 


32 


PE. 


23 


EL-. 


27 






Tfi 


22 


RE 


98 


Ft 


39 


AN 


64 


NC 


19 


SE 


49 


HI 


33 


EN 


111 






TE. 


71 


LI 


20 


IN 


75 


AD. 


27 


VE 


57 


NT 


30 


ON 


77 


ED. 


60 


WE 


22 


RI 


30 


UN 


21 
Table 9-D, Contd. — The 117 digraphs composing 75% of the 5,000 digraphs of Table 6, arranged
alphabetically according to their final letters —

(1) AND ACCORDING TO THEIR INITIAL LETTERS — Continued



CO . 




4l 


AR 


44 


OS 


14 


DO 




16 


TR. 


17 


iS. 


35 


pn 




40 


UR 


3i 


RS. 


31 


ho 




20 


ER. 


87 






to 




41 


OR. 


64 


AT. 


47 


LO 




is 


Pil„ 


18 


CT. 


14 


KO 




18 


HR. . 


17 


DT. 


15 


PO. 




17 


11 


27 


ET. 


37 


ftO 




28 






HT. 


28 


so 




15 


AS. „. 


41 


IT. 


27 


To 




sd 


ss 


19 


hr 


82 


WO.. 




19 


TS. 


19 


OT. 


19 








DS 


13 


RT. 


42 


pp 




20 


ES. 


84 


ST. 


63 


OP.. 




25 


NS. 


24 


¥T. 


19 






(2) AND ACCORDING TO THEIR ABSOLUTE FREQUENCIES


RA 




39 


TE. „. 


71 


TH. 


78 


MA. 




36 


RE..... 


57 


SHI .... 


26 


EA. 




35 


VP. 


57 


tih... 


20 


DA.. 




32 


SE. ...... 


49 


CH. .... 


14 



AU. 

ou. 

QU. 



Total 3,745 



EN. 

ON. 

IN. 

AN. 

UN. 

TO- 

CO 

to. 

pb 

ftO. 

«0 

wo. 

n6. — : — 

PO 

DO 

SO 

LO 
Table 9-D, Concluded. — The 117 digraphs composing 75% of the 5,000 digraphs of Table 6,
arranged alphabetically according to their final letters —

(2) AND ACCORDING TO THEIR ABSOLUTE FREQUENCIES — Concluded



OP 


26 


ES 


64 


KP 


20 


AS 


41 






IS 


36 






RS 


31 


ER. 


87 


NS 


24 


OR. 


64 


SS 


19 


AR. 


44 


TS 


19 


UR. 


31 


OS 


14 


IR. 


27 


DS 


13 


PR. 


18 






HR. 


17 


NT.... 


82 


TR... 


17 


5T. 


63 



AT 


47 


qu 


■ u 


RT 


42 


AU 


ia 


ET. .. 
HT 


37' 

28 

27 


IV. ... 


28 


TT 


EV. 


20 


or. .... 

TT 


19 ' 

19 

16' 


TW. 


. . . ' 

36 


DT. . 


IX. 




YT. 

CT 


16 

14 


TV 


r* •; '-f 


OU 






Total 


- ’- M 



Table 9-E. — All the 428 different digraphs of Table 6 arranged alphabetically first according to
their final letters and then according to their initial letters

(SEE TABLE 6. — READ DOWN THE COLUMNS)

Table 10-A. — The 56 trigraphs appearing 100 or more times in the 50,000 letters of Government
plain-text telegrams arranged according to their absolute frequencies





669 


TOM 


260 


AND. 


228 


ING. 


226 


IVE 


226 


TIO. 


221 


POR 


218 


OUR 


211 


THT. 


2U 


ONE 


210 


MIN 


207 


5?TO 


202 


EEN 


196 


GHT 


- 196 


TNE 


102 


VEN. 


190 


EVE 


l77 


EST. 


J76 



’to?.. 174' 

NTH. in' 

TWE. 170 

TWO. 163 

ATI 160 

THR. 168 

NTY. 157 

HRE. 163 

WEN- 153 

FOU. 162 

QRT. 146 

REE 146 

SilX. 146 

A5H- 143 

DAS. 140 

IGH. 140 

ERE 133 

COML 136 



EIG. 

m 

MEN 




-- 136 
— 136 
.. I3l 


«SEV 




.... 131 


E®S - 




— . 126 


inib 




.... 126 


NET 




118 


PER 




...^ 116 


3XA- 

terL _ 




-- m.. 

116 


EOa 14A 


Rfiax 

TEDl 

mi 

HIR. 

TRT 




-- U3 
1 

... 109 
... 109 
... 109 


DER 




... 101 


im — 




.- 100 
Table 10-B. — The 56 trigraphs appearing 100 or more times in the 50,000 letters of Government
plain-text telegrams arranged first alphabetically according to their initial letters and then
according to their absolute frequencies



AND. 


228 


GHT. 


106 


REE..... 




146 


ATI 


160 






RED..... 




- 113 


ASH 


143 


HRE _ 


153 








ATE. 


13S 


HER. 


106 


.QTn 




ono 








SIX..... 




__ 140 


coil. 


136 


ION. 


260 


SEV 




1 ai 






ING. 


226 


STA 




1 1 0 


DAS 


140 


IVE 


225 








DER 


101 


TNE 


192 


TTO 




221 


DRE 


100 


TCM 




140 


THT 




211 






IRT 




ENT 


Sfifl 


105 


TEE 




174 






TOP 




174 


EEN 


100 


MEN 


131 




TWE. 




- 1 70 


IfTVE 


177 










TWO...:. 




163 


E.^ 


170 


NTN 


207 




ERE 


13R 


NTH 


TI®... 




_ 108 


171 


TER. 




115 


ETB 


130 


NTT 


157 




ERS _ 


_ 120 


NET 


TEO 




112 


118 








EQII 


114 










ERT 


100 


OITR 


211 


UND..... 




125 






ONE 


210 








EOR 


21ft 


ORT 


146 


VEN..„. 




190 


EOU 


102 








" 


FiV. 


135 


PER. 


115 


WEN..... 




153 



Table 10-C. — The 56 trigraphs appearing 100 or more times in the 50,000 letters of Government
plain-text telegrams arranged first alphabetically according to their central letters and then
according to their absolute frequencies

DASL 140 DER^ 101 HIR_ „„ _ 106 


EEN 


196 


TGH 


140 


ENT. 


009 


VEN 


190 






AND 


228 


■i«E 


174 


THT 


211 


ING 


226 


wraj 


103 


GHT 


100 


ONE 


210 


reE 


140 


TOR 


108 


TNE 


192 


MEN 


1,31 






UND. 


125 


.SEV 


131 


TIO. 


221 






NET .. 


118 






TON 


260 


PER_ 


115 


NTH 


207 


FOR 


218 


TER 


no 


STT 


146 


TOP. 


174 


RED 


113 


ETG 


130 


FOU 


152 


TED. 


112 


FIV 


135 


COM. 


136 
Table 10-C, Concluded. — The 56 trigraphs appearing 100 or more times in the 50,000 letters of
Government plain-text telegrams arranged first alphabetically according to their central letters
and then according to their absolute frequencies



EQU 


114 


DRE 


inn 


.«?TA 


11.5 


tIDT? 




EST. 


176 


OUR 


ail 


ORT. 

ITRF 


xDo 

. . 14ft 

1^8 


A5?H 


14.3 


tihfi.... ..... 


993 


.<5Tn 


2 ns 


NTH 


i7i’ 


EVE. 


17lr 


ERS 


... 12ft 


ATT 


lAn 


TTrtl. 


tV() 


ERT 


10'S 


liTY 


T57 


TRT 


- lOft 


ATE 


13ft 


lira 


. ::ii3 


Table 10-D. — The 56 trigraphs appearing 100 or more times in the 50,000 letters of Government
plain-text telegrams arranged first alphabetically according to their final letters and then according
to their absolute frequencies










STA 


UK 


TRH 


i4n 


TER 


, HR 










HIR. ... 


108 


AND . 


228 


THT 


211 


DER 


161 


imn 


125 


ATT 


• IftO 






RED. 


113 


ERI 


109 


DA.R 


140 


TED. 


112 






ER.R 


12A 




. .7 


COIL 


136 




• ! 


IVE. 


_ 225 






ENT 


rAo 


ONE. 


210 


ION. 


260 




1_QA 


INE 


192 


NIN. 


207 


CSnT3r.pai-.am 

is'Q'p 


XvO 


EVE. 


177 


EEN. 


196 


nD«ii 




TEE. 


174 


VEN. 


190 


Vml.. a.... 

MITT 


lift 


TWE._ . 


170 


WEN. 


153 


Tom 


119 


HRE 


l.«i3 


MEN 


131 


XiXia...... 


lUO 


REE 


14fi 


TTO 


221 






ERE 


138 


5?TO 


202 


FCU,„. 


152 


ATE 


135 


TWO 


103 


EQU. 


114 


DRE. 


100 














TOP. 


174 


FIV. - 


135 


TNn 


99A 






SEV. 


131 


ETR 


13R 


FOR 


218 










OIIR 


21 1 


SIX. 


146 


NTH. .... 


171 


THR 


IRS 






ASH. 


143 


PE3L 


115 


NTY. 


157 
Table 11-A. — The 54 tetragraphs appearing 50 or more times in the 50,000 letters of Government
plain-text telegrams arranged according to their absolute frequencies



TTON 


5!1« 


THTR 


104 


ASHT. 


64 


EVEN 




pent 


1P3 


hund. 


64 


•PREN 


TfiS 


REQH 


93 


DRIED 


63 


ENTY 


1(T1 


HTRT 


97 


RIOD 


63 


STOP 


164. 


POJp 


03 


IVED. 


62 


WENT 


Ifia 


omES 


R7 


BNTR 


62 


MTME 


163 


UEST. 


87 


EPIC 


62 


Twmj 


162 


EQU1? 


86 


ITROM 


69 


THRE 


149 


l^RE. 


77 


TRTY 


69 


fOiJR 


144 


nMMA 


71 


RTl^ 


69 


IGHT 


140 


Lt,AR 


_ T\ 


UNDR. 


69 


ETVE 


1 36 


OT.T.A 


70 


NAUR 


66 




134 


vent 


70 


QUBT 


66 




1 32 


DOLLl....^ 


68 


UGHT. 


56 


nA«?H 


132 


T.ARS 


68 


S^AT^-r 


54 


SEVE 


121 


THTR 


68 


AflRH 


62 


TENTH 


114 


P1WT 


67 


RENT 


62 


WENT. 


111 


ERIO 


66 


FICE. 


50 



Table 11-B. — The 54 tetragraphs appearing 50 or more times in the 50,000 letters of Govern-
ment plain-text telegrams, arranged first alphabetically according to their initial letters, and then
according to their absolute frequencies



ASHT 


64 


HREE 


134 


AI^ 


69 


HTRT 


97 






HUl®. 


64 


CCWM. 


....... 93 









. 63 


IfiMT 


140 






IVED. 


.......... 62 


D^.RH 


139 


TRTY 


69 


DOlii. 


68 






TIRED 




TJAR 


71 






LARS. 


68 


EVEN. 


168 






iniiTV 




MIENT 


111 




V3? 






TENTH 


114 


NINE. 


163 


EENT 


102 


NDRE 


77 


eqUe 


36 


NAUR 


66 


ERIO. 


66 






lawns 


62 


OIIMA 


71 






OLLA- 


70 


FOUR 


. 144 


nilRT 


66 


FIVE. 


135 






PPTC 


62 


PTERT 


67 


FROM. 


69 






FICE. 


50 


QUES. 


87 



REQU. 


98 


RIOD.. 


63 







St4p„.-..- 164- 

: 131 

STAT-- A4. 



rroN 


213 


TEiaN 


169. 


TWT3N 


16« 


THRE 


149 


THIR. 


.. 104 



THIS. ... 



UEST... - 

UNDR. 69 

UGHT. 66 



VENT. 70 

WENT. 163 
Table 11-C. — The 54 tetragraphs appearing 50 or more times in the 50,000 letters of Government
plain-text telegrams arranged first alphabetically according to their second letters and then
according to their absolute frequencies



n^SH 


raq 


THT55 




EQUE-... 


RO 


T.ARS 


RR 








NAUG. 


«6 


TTOM 


2U 

.isa; 


HREE. 


1.R4 






MINE 


BRIO 


«6 


MORE- 


7T 


RTW 


135 

isa 


DRED 


6 a- 






RTOH 


FROM.... 


MT 


TEEN. 


168 


HTRT 


97 

98 


IRJTY 


fifl 


WENT. 


153 


HTnn 






SEVE 


121 


FICE. 


50 


ASHT.... 


64 


MENT... ... 

1?.R»IT 


111 

. m 2 


LLAR. 

nr.T.A 


7l 

70 


‘ STOP.... 
’ erttoip' 


":A • ^ ' : T^i /• ; 


REQU. 


98 


.154 

CQ 


IIEOT 


R7 




STAT.... 


-^54 


VENT. 

PERI... 

ffRNT 


70 


nM|U|A 


71 


67 

V 9 




QUES.... 


87 






enty „ 


161 


HUND.... 


64 


FFIC 


62 


RNTH 


114 


QURT-... 


56 


WMirs 


62 


AUGH.... 


52 


TfSHT 


1 AO 


UNDR. 


59 








56 






EVEN.... 


168; 






ROHR 


144 

93 


IVED.... 


82 


•PHRR 


149 


fiOMM 






THTR 


104 


DOLL 


68 


TWEN.... 


152 


Table 11-D. — The 54 tetragraphs appearing 50 or more times in the 50,000 letters of Government
plain-text telegrams arranged first alphabetically according to their third letters and then according
to their absolute frequencies



T.I.AR 


71 


EIGH 


132 


ocnni 


88 


STAT 


54 


AUGH. 


59 


OMUA 


n 


FICE 


50 


TGHT 


140 


WENT. 


laa 






A55HT 


A4 


NINE. 


168 


UNDR. 


59 


TVnjrp 




MENT. 


m 






UVj1T1_ 


£>0 


EENT. 


102 


EVEN. 


168 


THIR. 


,104 


VENT. 


70 


TEEN . 


16.8 


HUND 


64 


TWRM 


152 


THIS. 


68 


GENT 


52 


HREE 


134 


ERIO 


66 






qiIE.S 


R7 


FFIC 


62 


TION 


21R 


DRED 


63 






STOP 


154 


TVED 


62 


OT.T.A 


70 


RTOD 


63 


RTEE. 


59 


DOLL. 


68 


FROM. 


59 
Table 11-D, Concluded. — The 54 tetragraphs appearing 50 or more times in the 50,000 letters of
Government plain-text telegrams arranged first alphabetically according to their third letters and
then according to their absolute frequencies



REQU 


98 


OURT 


na 


TRTY 


Rtt 






DA.SH 


1S2 


FOUR 


144 


THRE . 


149 


TTEST 


S7 


EQUE 


66 


HTRT 


97 






MAIIR 


66 


NDRE 


77 


ENTY 


161 






TJIR55 


68 


ENTH 


114 


FTVE 


1.6.6 


PERI 


67 


ENTS 


62 


SEVE. 


121 



Table 11-E. — The 54 tetragraphs appearing 50 or more times in the 50,000 letters of Government
plain-text telegrams arranged first alphabetically according to their final letters and then according
to their absolute frequencies



OMMA 


71 


BASH 


132 


quES 


87 


OLI.A 


70 


EIGH 


132 


THT.*? 


68 






ENTH 


114 


T.AR.R 


68 






AfVan 


.62 


ENTS 


62 


FFIC 


62 














PERI 


67 














WENT 


153 


HUND 


6l4 


nm.T. 


68 


TGHT 


lAfK 


HRED 


63 






MENT 


111 


RTOD 


63 


COMM 


93 


EENT 


102 


IVED 


62 


FROM 


50 


HTRT 


97 . 










UEOT. 


: . f* 
87 






TION..... .... . 


218 


VENT 


70 


NINE 


153 


EVEN;.. 


168 


ASHTL.*- 


64 


THRE 


149 


TEEN 


16.6 


ut3W _ 


56 


FIVE 


135 


TWEN 


152 


OllRT 


56 


HREE 


134 






STAT 


54 ;: 


.S35VE 


121 


ERTO 


66 


CENT _ .. . 


62 


EQUE 


86 










NT1RE 


77 




154 




' •< 


RTEE_„.*_ 


59 






RF4JII 


98 


FTCE . . 


50 


FOUR 


144 










THIR. 


. ... 104 










T.I.AR 


71 


ENTY 


Iftl 


NAUG- _ 


56 


UNDR. 


59 


IRTY 


59 
Table 12. — Average and mean lengths of words

Number of         Number of         Number of
letters in        times word        letters
word              appears

      1               378               378
      2               973             1,946
      3             1,307             3,921
      4             1,635             6,540
      5             1,410             7,050
      6             1,143             6,858
      7             1,009             7,063
      8               717             5,736
      9               476             4,284
     10               274             2,740
     11               161             1,771
     12                86             1,032
     13                23               299
     14                23               322
     15                 4                60

    120             9,619            50,000

(1) Average length of words 5.2 letters.

(2) Average length of messages 217 letters.

(3) Mean length of messages 191 letters.

(4) Mode (most frequent) length 106-114 letters.

(5) It is extremely unusual to find 5 consecutive letters without at least one vowel.

(6) The average number of letters between vowels is 2.
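
The figure of 5.2 letters follows directly from the two totals of Table 12. The following sketch, offered only as a check on the arithmetic, recomputes both totals and the quotient; the variable names are illustrative.

```python
# Word-length distribution of Table 12: {letters in word: times word appears}.
word_counts = {
    1: 378, 2: 973, 3: 1307, 4: 1635, 5: 1410, 6: 1143, 7: 1009, 8: 717,
    9: 476, 10: 274, 11: 161, 12: 86, 13: 23, 14: 23, 15: 4,
}

total_words = sum(word_counts.values())
total_letters = sum(length * count for length, count in word_counts.items())
average_length = total_letters / total_words

print(total_words, total_letters, round(average_length, 1))
```

The quotient of 50,000 letters by 9,619 words is very nearly 5.2.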





Accented letters

Alphabets:

Bipartite

Deciphering

Direct standard

Enciphering

Keyword-mixed

Mixed

Reversed standard

Standard

Systematically mixed

Analytic key for cryptanalysis

Arbitrary symbols

Assumptions

Average length of messages

Baconian cipher

Beginnings of messages

Biliteral substitution

Bipartite alphabet

Blanks, number of

Book systems

Censorship, methods for evading

Characteristic frequency of the letters of a language

Characteristic frequency, suppression of 

Checkerboard systems 

Checkerboards, 4n9quue 

Chinese Official Telegraph Code. 

Cipher: 

Baconian 

Component 

Distinguished from code 

Text, length of, as compared with plain text 
Section I

INTRODUCTORY REMARKS

Paragraph

The essential difference between monoalphabetic and polyalphabetic substitution 1

Primary classification of polyalphabetic systems 2

Primary classification of periodic systems 3

Sequence of study of polyalphabetic systems 4



1. The essential difference between monoalphabetic and polyalphabetic substitution. — a.
In the substitution methods thus far discussed it has been pointed out that their basic feature
is that of monoalphabeticity. From the cryptanalytic standpoint, neither the nature of the
cipher symbols, nor their method of production is an essential feature, although these may be
differentiating characteristics from the cryptographic standpoint. It is true that in those cases
designated as monoalphabetic substitution with variants or multiple equivalents, there is a
departure, more or less considerable, from strict monoalphabeticity. In some of those cases,
indeed, there may be available two or more wholly independent sets of equivalents, which,
moreover, may even be arranged in the form of completely separate alphabets. Thus, while a
loose terminology might permit one to designate such systems as polyalphabetic, it is better to
reserve this nomenclature for those cases wherein polyalphabeticity is the essence of the method,
specifically introduced with the purpose of imparting a positional variation in the substitutive
equivalents for plain-text letters, in accordance with some rule directly or indirectly connected
with the absolute positions the plain-text letters occupy in the message. This point calls for
amplification.

b. In monoalphabetic substitution with variants the object of having different or multiple
equivalents is to suppress, so far as possible by simple methods, the characteristic frequencies
of the letters occurring in plain text. As has been noted, it is by means of these characteristic
frequencies that the cipher equivalents can usually be identified. In these systems the varying
equivalents for plain-text letters are subject to the free choice and caprice of the enciphering
clerk; if he is careful and conscientious in the work, he will really make use of all the different
equivalents afforded by the system; but if he is slip-shod and hurried in his work, he will use the
same equivalents repeatedly rather than take pains and time to refer to the charts, tables, or
diagrams to find the variants. Moreover, and this is a crucial point, even if the individual
enciphering clerks are extremely careful, when many of them employ the same system it is entirely
impossible to insure a complete diversity in the encipherments produced by two or more clerks
working at different message centers. The result is inevitably to produce plenty of repetitions
in the texts emanating from several stations, and when texts such as these are available for
study they are open to solution, by a comparison of their similarities and differences.

c. In true polyalphabetic systems, on the other hand, there is established a rather definite
procedure which automatically determines the shifts or changes in equivalents or in the manner
in which they are introduced, so that these changes are beyond the momentary whim or choice of
the enciphering clerk. When the method of shifting or changing the equivalents is scientifically
sound and sufficiently complex, the research necessary to establish the values of the cipher
characters is much more prolonged and difficult than is the case even in complicated monoalphabetic
substitution with variants, as will later be seen. These are the objects of true polyalphabetic
substitution systems. The number of such systems is quite large, and it will be possible to
describe in detail the cryptanalysis of only a few of the more common or typical examples of
methods encountered in practical military communications.

d. The three methods, (1) single-equivalent monoalphabetic substitution, (2) monoalpha- 
betic substitution with variants, and (3) true polyalphabetic substitution, show the following 
relationships as regards the equivalency between plain-text and cipher-text units: 

A. In method (1), there is a set of 26 symbols; a plain-text letter is always represented by 
one and only one of these symbols; conversely, a symbol always represents the same plain-text 
letter. The equivalence between the plain-text and the cipher letters is constant in both enci- 
pherment and decipherment. 

B. In method (2), there is a set of n symbols, where n may be any number greater than 26
and often is a multiple of that number; a plain-text letter may be represented by 1, 2, 3, . . .
different symbols; conversely, a symbol always represents the same plain-text letter, the same as
is the case in method (1). The equivalence between the plain-text and the cipher letters is
variable in encipherment but constant in decipherment.*

C. In method (3) there is, as in the first method, a set of 26 symbols; a plain-text letter 
may be represented by 1, 2, 3, . . . 26 different symbols; conversely, a symbol may represent 
1, 2, 3, . . . 26 different plain text letters, depending upon the system and the specific key. 
The equivalence between the plain-text and the cipher letters is variable in both encipherment 
and decipherment. 
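
The three relationships A, B, and C may be made concrete by a small experiment. The following is only a sketch under stated assumptions: method (1) is represented by a single shifted standard alphabet, method (2) by a hypothetical scheme giving each letter two numerical variants used alternately, and method (3) by a repeating key (here BED) applied to direct standard alphabets. None of these particular schemes is prescribed by the text; they merely serve to exhibit the constancy or variability in each direction.

```python
import string
from collections import defaultdict

ALPHABET = string.ascii_uppercase


def relation_profile(pairs):
    """Return (max cipher units per plain letter, max plain letters per cipher unit)."""
    enc, dec = defaultdict(set), defaultdict(set)
    for plain, cipher in pairs:
        enc[plain].add(cipher)
        dec[cipher].add(plain)
    return (max(len(v) for v in enc.values()),
            max(len(v) for v in dec.values()))


def method1(plain, shift=3):
    """Single-equivalent monoalphabetic substitution (one shifted alphabet)."""
    return [(p, ALPHABET[(ALPHABET.index(p) + shift) % 26]) for p in plain]


def method2(plain):
    """Monoalphabetic substitution with variants (hypothetical scheme):
    letter number i is written as i or i + 26, the two being used alternately."""
    uses = defaultdict(int)
    pairs = []
    for p in plain:
        variant = uses[p] % 2
        uses[p] += 1
        pairs.append((p, f"{ALPHABET.index(p) + 26 * variant:02d}"))
    return pairs


def method3(plain, key="BED"):
    """Polyalphabetic substitution: repeating key over direct standard alphabets."""
    pairs = []
    for n, p in enumerate(plain):
        shift = ALPHABET.index(key[n % len(key)])
        pairs.append((p, ALPHABET[(ALPHABET.index(p) + shift) % 26]))
    return pairs


text = "ATTACKATDAWNATTACK"
for name, method in (("(1)", method1), ("(2)", method2), ("(3)", method3)):
    print(name, relation_profile(method(text)))
# (1) (1, 1)   constant in both directions
# (2) (2, 1)   variable in encipherment, constant in decipherment
# (3) (2, 2)   variable in both directions
```

A first figure greater than 1 indicates variability in encipherment; a second figure greater than 1 indicates variability in decipherment.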

2. Primary classification of polyalphabetic systems. — a. A primary classification of poly-
alphabetic systems into two rather distinct types may be made: (1) periodic systems and (2)
aperiodic systems. When the enciphering process involves a cryptographic treatment which is
repetitive in character, and which results in the production of cyclic phenomena in the cryptographic
text, the system is termed periodic. When the enciphering process is not of the type
described in the foregoing general terms, the system is termed aperiodic. The substitution in
both cases involves the use of two or more cipher alphabets.

b. The cyclic phenomena inherent in a periodic system may be exhibited externally, in
which case they are said to be patent, or they may not be exhibited externally, and must be
uncovered by a preliminary step in the analysis, in which case they are said to be latent. The
periodicity may be quite definite in nature, and therefore determinable with mathematical
exactitude allowing for no variability, in which case the periodicity is said to be fixed. In other
instances the periodicity is more or less flexible in character and even though it may be
determinable mathematically, allowance must be made for a degree of variability subject to limits
controlled by the specific system under investigation. The periodicity is in this case said to be
flexible, or variable within limits.

* There is a monoalphabetic method in which the inverse result obtains, the correspondence being constant
in encipherment but variable in decipherment; this is a method not found in the usual books on cryptography
but in an essay on that subject by Edgar Allan Poe, entitled, in some editions of his works, A few words on secret
writing and, in other editions, Cryptography. The method is to draw up an enciphering alphabet such as the
following (using Poe's example):

Plain.   ABCDEFGHIJKLMNOPQRSTUVWXYZ
Cipher.  SUAVITERINMODOFORTITERINRE

In such an alphabet, because of repetitions in the cipher component, the plain-text equivalents are subject to a
considerable degree of variability, as will be seen in the deciphering alphabet:

Cipher.  A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Plain.   C     M G O     E       K J L     H A F B D
                 U       I         X N     Q   R
                 Z       S           P     V   T
                         W                 Y

This type of variability gives rise to ambiguities in decipherment. A cipher group such as TIE would yield
such plain-text sequences as REG, FIG, TEU, REU, etc., which could be read only by context. No system of such a
character would be practical for serious usage. For a further discussion of this type of cipher alphabet see
Friedman, William F., Edgar Allan Poe, Cryptographer, Signal Corps Bulletins Nos. 97 and 98, 1937-38.
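
The ambiguity of the Poe alphabet described in the footnote to paragraph 1d can be verified mechanically. The sketch below inverts the enciphering alphabet given there and lists every possible reading of the cipher group TIE; the function name `readings` is purely illustrative.

```python
from collections import defaultdict
from itertools import product

PLAIN = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
CIPHER = "SUAVITERINMODOFORTITERINRE"  # Poe's example alphabet

# Deciphering alphabet: each cipher letter maps to every plain letter it may stand for.
deciphering = defaultdict(list)
for p, c in zip(PLAIN, CIPHER):
    deciphering[c].append(p)


def readings(group):
    """All plain-text sequences that a cipher group could represent."""
    return ["".join(r) for r in product(*(deciphering[c] for c in group))]


for c in sorted(deciphering):
    print(c, "".join(deciphering[c]))

candidates = readings("TIE")
print(len(candidates), "possible readings of TIE")
```

Since T, I, and E have three, four, and three plain-text equivalents respectively, the single group TIE admits 36 readings, among them REG, FIG, TEU, and REU.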

3. Primary classification of periodic systems. — a. Periodic polyalphabetic substitution
systems may primarily be classified into two kinds:

(1) Those in which only a few of a whole set of cipher alphabets are used in enciphering
individual messages, these alphabets being employed repeatedly in a fixed sequence throughout each
message. Because it is usual to employ a secret word, phrase, or number as a key to determine
the number, identity, and sequence with which the cipher alphabets are employed, and this
key is used over and over again in encipherment, this method is often called the repeating-key
system, or the repeating-alphabet system. It is also sometimes referred to as the multiple-alphabet
system because if the keying of the entire message be considered as a whole it is composed
of multiples of a short key used repetitively.* In this text the designation "repeating-key
system" will be used.

(2) Those in which all the cipher alphabets comprising the complete set for the system are
employed one after the other successively in the encipherment of a message, and when the
last alphabet of the series has been used, the encipherer begins over again with the first alphabet.
This is commonly referred to as a progressive-alphabet system because the cipher alphabets are
used in progression.
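
The distinction between the two kinds may be sketched as follows. The example assumes, purely for illustration, that the cipher alphabets are the 26 direct standard alphabets (each a simple shift of the normal sequence); the text prescribes no such restriction, and the function names are hypothetical.

```python
import string

ALPHABET = string.ascii_uppercase


def repeating_key_encipher(plain, key):
    """Kind (1): only the alphabets named by the key are used, in a fixed
    sequence that repeats throughout the message."""
    shifts = [ALPHABET.index(k) for k in key]
    return "".join(
        ALPHABET[(ALPHABET.index(p) + shifts[i % len(shifts)]) % 26]
        for i, p in enumerate(plain)
    )


def progressive_encipher(plain, start=0):
    """Kind (2): all 26 alphabets are used one after another, beginning again
    with the first when the last has been used."""
    return "".join(
        ALPHABET[(ALPHABET.index(p) + start + i) % 26]
        for i, p in enumerate(plain)
    )


# Enciphering a run of identical letters exposes the sequence of alphabets.
print(repeating_key_encipher("AAAAAA", "BED"))  # BEDBED
print(progressive_encipher("AAAA"))             # ABCD
```

In the first case the period equals the length of the key; in the second, under the assumption stated, it equals the number of alphabets in the complete set.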

4. Sequence of study of polyalphabetic systems. — a. In the studies to be followed in con-
nection with polyalphabetic systems, the order in which the work will proceed conforms very
closely to the classifications made in paragraphs 2 and 3. Periodic polyalphabetic substitution
ciphers will come first, because they are, as a rule, the simpler and because a thorough
understanding of the principles of their analysis is prerequisite to a comprehension of how aperiodic
systems are solved. But in the final analysis the solution of examples of both types rests upon
the conversion or reduction of polyalphabeticity into monoalphabeticity. If this is possible,
solution can always be achieved, granted there are sufficient data in the final monoalphabetic
distributions to permit of solution by recourse to the ordinary principles of frequency.
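
For a periodic cipher whose period is already known, the reduction to monoalphabeticity amounts to distributing the cipher letters into as many columns as there are alphabets and making a frequency count of each column. The following is a minimal sketch of that step only; it assumes the period is given, and the function name is illustrative.

```python
from collections import Counter


def monoalphabetic_distributions(cipher, period):
    """Split a periodic cipher text into `period` columns, each enciphered by a
    single alphabet, and return the frequency distribution of each column."""
    columns = [cipher[i::period] for i in range(period)]
    return [Counter(column) for column in columns]


# "EEEEEEEEE" enciphered with the repeating key BED over standard alphabets.
cipher_text = "FIHFIHFIH"
for number, distribution in enumerate(monoalphabetic_distributions(cipher_text, 3), 1):
    print("Alphabet", number, dict(distribution))
```

Each distribution so obtained may then be treated by the ordinary principles of frequency, provided it contains sufficient data.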

b. First in the order of study of periodic systems will come the analysis of repeating-key
systems. Some of the more simple varieties will be discussed in detail, with examples.
Subsequently, ciphers of the progressive type will be discussed. There will then follow a more or
less detailed treatment of aperiodic systems.



* French terminology calls this the “double-key method”, but there is no logic in such nomenclature.









Section II 

CIPHER ALPHABETS FOR POLYALPHABETIC SUBSTITUTION



                                                                           Paragraph

Classification of cipher alphabets upon the basis of their derivation  5

Primary components and secondary alphabets  6

Primary components, cipher disks, and square tables  7



5. Classification of cipher alphabets upon the basis of their derivation. — a. The substitution
processes in polyalphabetic methods involve the use of a plurality of cipher alphabets.
The latter may be derived by various schemes, the exact nature of which determines the principal
characteristics of the cipher alphabets and plays a very important role in the preparation and
solution of polyalphabetic cryptograms. For these reasons it is advisable, before proceeding to a
discussion of the principles and methods of analysis, to point out these various types of cipher
alphabets, show how they are produced, and how the method of their production or derivation
may be made to yield important clues and short-cuts in analysis.

b. A primary classification of cipher alphabets for polyalphabetic substitution may be made
into the two following types:

(1) Independent or unrelated cipher alphabets.

(2) Derived or interrelated cipher alphabets.

c. Independent cipher alphabets may be disposed of in a very few words. They are merely
separate and distinct alphabets showing no relationship to one another in any way. They may
be compiled by the various methods discussed in Section IX of Elementary Military Cryptography.
The solution of cryptograms written by means of such alphabets is rendered more difficult by
reason of the absence of any relationship between the equivalents of one cipher alphabet and
those of any of the other alphabets of the same cryptogram. On the other hand, from the point of
view of practicability in their production and their handling in cryptographing and
decryptographing, they present some difficulties which make them less favored by cryptographers
than cipher alphabets of the second type.

d. Derived or interrelated alphabets, as their name indicates, are most commonly produced
by the interaction of two primary components, which when juxtaposed at the various points of
coincidence can be made to yield secondary alphabets.¹

¹ See Secs. VIII and IX, Elementary Military Cryptography.

6. Primary components and secondary alphabets. — Two basic, slidable sequences or components
of n characters each will yield n secondary alphabets. The components may be classified
according to various schemes. For cryptanalytic purposes the following classification will be
found useful:

Case A. The primary components are both normal sequences.

(1) The sequences proceed in the same direction. (The secondary alphabets are direct
standard alphabets.) (Pars. 13-15.)

(2) The sequences proceed in opposite directions. (The secondary alphabets are reversed
standard alphabets; they are also reciprocal cipher alphabets.) (Pars. 13i, 14g.)

Case B. The primary components are not both normal sequences.

(1) The plain component is normal, the cipher component is a mixed sequence. (The
secondary alphabets are mixed alphabets.) (Pars. 16-25.)

(2) The plain component is a mixed sequence, the cipher component is normal. (The
secondary alphabets are mixed alphabets.) (Par. 28.)

(3) Both components are mixed sequences.

(a) Components are identical mixed sequences.

I. Sequences proceed in the same direction. (The secondary alphabets are
mixed alphabets.) (Par. 28.)

II. Sequences proceed in opposite directions. (The secondary alphabets are
reciprocal mixed alphabets.) (Par. 38.)

(b) Components are different mixed sequences. (The secondary alphabets are mixed
alphabets.) (Par. 39.)
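
The statement that two sliding components of n characters yield n secondary alphabets, and the
reciprocal property claimed for components running in opposite directions, can be checked with a
short Python sketch. The function names, the dictionary representation, and the choice of mixed
sequence are assumptions made for illustration only; they are not part of the text.

```python
import string

NORMAL = string.ascii_uppercase
# An example mixed sequence, assumed here purely for illustration.
MIXED = "FBPYRCQZIGSEHTDJUMKVALWNOX"


def secondary_alphabets(plain_comp, cipher_comp):
    """Slide cipher_comp past plain_comp; return one plain->cipher mapping per juxtaposition."""
    n = len(plain_comp)
    return [dict(zip(plain_comp, cipher_comp[s:] + cipher_comp[:s])) for s in range(n)]


def is_reciprocal(alphabet):
    """True when enciphering and deciphering with the alphabet are the same operation."""
    return all(alphabet[c] == p for p, c in alphabet.items())


direct = secondary_alphabets(NORMAL, NORMAL)              # Case A (1)
reversed_std = secondary_alphabets(NORMAL, NORMAL[::-1])  # Case A (2)
mixed_recip = secondary_alphabets(MIXED, MIXED[::-1])     # Case B (3)(a) II

print(len(direct), "secondary alphabets")
print("Case A (2) all reciprocal:", all(is_reciprocal(a) for a in reversed_std))
print("Case B (3)(a) II all reciprocal:", all(is_reciprocal(a) for a in mixed_recip))
```

Each mapping in the returned list corresponds to one point of coincidence of the two components.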

7. Primary components, cipher disks, and square tables.— a. In preceding texts it has 
been shown that the equivalents obtainable from the use of quadricular or square tables may be 
duplicated by the use of revolving cipher disks or of sliding primary components. It was also 
stated that there are various ways of employing such tables, disks, and sliding components.
Cryptographically the results may be quite diverse from different methods of using such
paraphernalia, since the specific equivalents obtained from one method may be altogether different
from those obtained from another method. But from the cryptanalytic point of view the
diversity referred to is of little significance; only in one or two cases does the specific method of
employing these cryptographic instrumentalities have an important bearing upon the procedure
in cryptanalysis. However, it is advisable that the student learn something about these different
methods before proceeding with further work.

b. There are, not two, but four letters involved in every case of finding equivalents by means
of sliding primary components; furthermore, the determination of an equivalent for a given
plain-text letter is representable by two equations involving four elements, usually letters.
Three of these letters are by this time well known to and understood by the student, viz,
$\theta_k$, $\theta_p$, and $\theta_c$. The fourth element or letter has been passed over without
much comment, but cryptographically it is just as important a factor as the other three. Its
function may best be indicated by noting what happens when two primary components are juxtaposed
for the purpose of finding equivalents. Suppose these components are the following sequences:

(1) ABCDEFGHIJKLMNOPQRSTUVWXYZ

(2) FBPYRCQZIGSEHTDJUMKVALWNOX

Now suppose one is merely asked to find the equivalent of $P_p$ when the key letter is K. Without
further specification, the cipher equivalent cannot be stated; for it is necessary to know not only
which K will be used as the key letter, the one in the component labeled (1) or the one in the
component labeled (2), but also what letter the $K_k$ will be set against, in order to juxtapose the
two components. Most of the time, in preceding texts, these two factors have been tacitly
assumed to be fixed and well understood: the $K_k$ is sought in the mixed, or cipher component,
and this K is set against A in the normal, or plain component. Thus:

                         Plain      Index
                           ↓          ↓
(1) Plain   ABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZ
(2) Cipher          FBPYRCQZIGSEHTDJUMKVALWNOX
                           ↑          ↑
                         Cipher      Key

With this setting $P_p = Z_c$.







c. The letter A in this case may be termed the index letter, symbolized $A_i$. The index letter
constitutes the fourth element involved in the two equations applicable to the finding of
equivalents by sliding components. The four elements are therefore these:

(1) The key letter, $\theta_k$

(2) The index letter, $\theta_i$

(3) The plain-text letter, $\theta_p$

(4) The cipher letter, $\theta_c$

The index letter is commonly the initial letter of the component; but this, too, is only a
convention. It might be any letter of the sequence constituting the component, as agreed upon by
the correspondents. However, in the subsequent discussion it will be assumed that the index letter
is the initial letter of the component in which it is located, unless otherwise stated.

d. In the foregoing case the enciphering equations are as follows:

(I) $K_k = A_i$; $P_p = Z_c$

But there is nothing about the use of sliding components which excludes other methods of finding
equivalents than that shown above. For instance, despite the labeling of the two components
as shown above, there is nothing to prevent one from seeking the plain-text letter in the
component labeled (2), that is, the cipher component, and taking as its cipher equivalent the letter
opposite it in the other component labeled (1). Thus:

                    Cipher          Index
                      ↓               ↓
(1)         ABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZ
(2)                 FBPYRCQZIGSEHTDJUMKVALWNOX
                      ↑               ↑
                    Plain            Key

Thus:

(II) $K_k = A_i$; $P_p = K_c$

e. Since equations (I) and (II) yield different resultants, even with the same index, key,
and plain-text letters, it is obvious that an accurate formula to cover a specific pair of enciphering
equations must include data showing in what component each of the four letters comprising the
equations is located. Thus, equations (I) and (II) should read:

(I) $K_k$ in component (2) = $A_i$ in component (1); $P_p$ in component (1) = $Z_c$ in component (2).

(II) $K_k$ in component (2) = $A_i$ in component (1); $P_p$ in component (2) = $K_c$ in component (1).

For the sake of brevity, the following notation will be used:

(I) $K_{k/2} = A_{i/1}$; $P_{p/1} = Z_{c/2}$

(II) $K_{k/2} = A_{i/1}$; $P_{p/2} = K_{c/1}$

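f. Employing two sliding components and the four letters entering into an enciphering
equation, there are, in all, twelve different resultants possible for the same set of components
and the same set of four basic elements. These twelve differences in resultants arise from a set
of twelve different enciphering conditions, as set forth below (the notation adopted in
subparagraph e is used):

(1) $\theta_{k/2} = \theta_{i/1}$; $\theta_{p/1} = \theta_{c/2}$      (7) $\theta_{k/2} = \theta_{p/1}$; $\theta_{i/2} = \theta_{c/1}$

(2) $\theta_{k/2} = \theta_{i/1}$; $\theta_{p/2} = \theta_{c/1}$      (8) $\theta_{k/2} = \theta_{c/1}$; $\theta_{i/2} = \theta_{p/1}$

(3) $\theta_{k/1} = \theta_{i/2}$; $\theta_{p/1} = \theta_{c/2}$      (9) $\theta_{k/1} = \theta_{p/2}$; $\theta_{i/1} = \theta_{c/2}$

(4) $\theta_{k/1} = \theta_{i/2}$; $\theta_{p/2} = \theta_{c/1}$      (10) $\theta_{k/1} = \theta_{c/2}$; $\theta_{i/1} = \theta_{p/2}$

(5) $\theta_{k/2} = \theta_{p/1}$; $\theta_{i/1} = \theta_{c/2}$      (11) $\theta_{k/1} = \theta_{p/2}$; $\theta_{i/2} = \theta_{c/1}$

(6) $\theta_{k/2} = \theta_{c/1}$; $\theta_{i/1} = \theta_{p/2}$      (12) $\theta_{k/1} = \theta_{c/2}$; $\theta_{i/2} = \theta_{p/1}$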
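
The twelve enciphering conditions can be expressed as data and evaluated mechanically. The
Python sketch below is an illustration only: the `RULES` table, the `encipher` function, and the
tuple encoding (symbol, component, symbol, component) are assumed names and conventions, and the
component numbers in rules 4 to 12 follow the reading of the list adopted here. The index letter
is passed explicitly rather than assumed.

```python
PLAIN = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"    # component (1)
CIPHER = "FBPYRCQZIGSEHTDJUMKVALWNOX"   # component (2)
COMPONENTS = {1: PLAIN, 2: CIPHER}

# Each rule is a pair of equations; k = key, i = index, p = plain, c = cipher letter.
RULES = {
    1: (("k", 2, "i", 1), ("p", 1, "c", 2)),
    2: (("k", 2, "i", 1), ("p", 2, "c", 1)),
    3: (("k", 1, "i", 2), ("p", 1, "c", 2)),
    4: (("k", 1, "i", 2), ("p", 2, "c", 1)),
    5: (("k", 2, "p", 1), ("i", 1, "c", 2)),
    6: (("k", 2, "c", 1), ("i", 1, "p", 2)),
    7: (("k", 2, "p", 1), ("i", 2, "c", 1)),
    8: (("k", 2, "c", 1), ("i", 2, "p", 1)),
    9: (("k", 1, "p", 2), ("i", 1, "c", 2)),
    10: (("k", 1, "c", 2), ("i", 1, "p", 2)),
    11: (("k", 1, "p", 2), ("i", 2, "c", 1)),
    12: (("k", 1, "c", 2), ("i", 2, "p", 1)),
}


def encipher(rule, key, index, plain):
    """Return the cipher letter for the given rule number, key, index, and plain letters."""
    known = {"k": key, "i": index, "p": plain}
    first, second = RULES[rule]
    # The equation that does not mention the cipher letter fixes the juxtaposition.
    setting, lookup = (first, second) if "c" not in first else (second, first)
    s1, c1, s2, c2 = setting
    pos = {c1: COMPONENTS[c1].index(known[s1]), c2: COMPONENTS[c2].index(known[s2])}
    shift = (pos[2] - pos[1]) % 26    # position in (2) = position in (1) + shift
    s1, c1, s2, c2 = lookup
    if s1 == "c":                     # make (s1, c1) the known letter of the pair
        s1, c1 = s2, c2
    at = COMPONENTS[c1].index(known[s1])
    if c1 == 1:
        return CIPHER[(at + shift) % 26]
    return PLAIN[(at - shift) % 26]


for rule in (1, 2, 3):
    print(rule, encipher(rule, "K", "A", "P"))
```

With key K, index A, and plain letter P, rules 1 and 2 reproduce the resultants of equations (I)
and (II). In this sketch each even-numbered rule undoes the odd-numbered rule before it when the
same key and index letters are used.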
g. The twelve resultants obtainable from juxtaposing sliding components as indicated under
the preceding subparagraph may also be obtained either from one square table, in which case
twelve different methods of finding equivalents must be applied, or from twelve different square
tables, in which case one standard method of finding equivalents will serve all purposes.

h. If but one table such as that shown below as Table I-A is employed, the various methods
of finding equivalents are difficult to keep in mind.

Table I-A

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

F B P Y R C Q Z I G S E H T D J U M K V A L W N O X
B P Y R C Q Z I G S E H T D J U M K V A L W N O X F
P Y R C Q Z I G S E H T D J U M K V A L W N O X F B
Y R C Q Z I G S E H T D J U M K V A L W N O X F B P
R C Q Z I G S E H T D J U M K V A L W N O X F B P Y
C Q Z I G S E H T D J U M K V A L W N O X F B P Y R
Q Z I G S E H T D J U M K V A L W N O X F B P Y R C
Z I G S E H T D J U M K V A L W N O X F B P Y R C Q
I G S E H T D J U M K V A L W N O X F B P Y R C Q Z
G S E H T D J U M K V A L W N O X F B P Y R C Q Z I
S E H T D J U M K V A L W N O X F B P Y R C Q Z I G
E H T D J U M K V A L W N O X F B P Y R C Q Z I G S
H T D J U M K V A L W N O X F B P Y R C Q Z I G S E
T D J U M K V A L W N O X F B P Y R C Q Z I G S E H
D J U M K V A L W N O X F B P Y R C Q Z I G S E H T
J U M K V A L W N O X F B P Y R C Q Z I G S E H T D
U M K V A L W N O X F B P Y R C Q Z I G S E H T D J
M K V A L W N O X F B P Y R C Q Z I G S E H T D J U
K V A L W N O X F B P Y R C Q Z I G S E H T D J U M
V A L W N O X F B P Y R C Q Z I G S E H T D J U M K
A L W N O X F B P Y R C Q Z I G S E H T D J U M K V
L W N O X F B P Y R C Q Z I G S E H T D J U M K V A
W N O X F B P Y R C Q Z I G S E H T D J U M K V A L
N O X F B P Y R C Q Z I G S E H T D J U M K V A L W
O X F B P Y R C Q Z I G S E H T D J U M K V A L W N
X F B P Y R C Q Z I G S E H T D J U M K V A L W N O



For example: 

(1) For enciphering equations 6iky3=6i/i; Op/i=6«/3: 

Locate 0p in top sequence; locate 6k in first colmnn; 

Go is letter within Ihe square at intersection of the two lines thus determined. 
Thus: 



Pp/t=Z«/s 











(2) For enciphering equatiozis 6k/t=0i/i; dp/i=6e/i: 

Locate 0k in first column; follow line to right to 0p; proceed up this column; 0, is 
letter at top. 

Thus: 

l^/i=Ai/i; Pp/j=K,/i 



(3) For enciphering equations 0k/i=0i/3; 0p/i=0<i/a: 

Locate 0k in top sequence and proceed down column to 6i ; 

Locate 0p in top sequence; 0, is letter at other comer of rectangle thus formed. 
Thus: 

Kk/i=Ai/2; Pp/i=Xefl 



Only three different methods have been shown and the student no doubt already has encountered 
difficulty in keeping them segregated in his mind. It would obviously be very confusing to try 
to remember aU twelve methods. But if one standard or fixed method of finding equivalents is 
followed with several different tables, then this difficulty disappears. Suppose that the following 
method is adopted: Arrange the square so that the plain-text letter may be sought in a separate 
sequence, arranged alphabetically, above the square and so that the key letter may be sought 
in a separate sequence, also arranged alphabetically, to the left of the square; look for the plain- 
text letter in the top row; locate the key letter in the 1st column to the left; find the letter stand- 
ing within the square at the intersection of the vertical and horizontal lines thus determined. 
Then twdve squares, equivalent to the twelve different conditions listed in subparagraph/, can 
readily be constructed. They are all shown in Appendix 1, pp. 96-107. 

i, "When these square tables are examined carefully, certain interesting points are noted. 
In the first place, the tables may be paired so that one of a pair may serve for enciphering and the 
other of the pair may serve for deciphering, or vice versa. For example, tables I and II bear this 
reciprocal relationship to each other; HI and IV, V and VI, VII and VIII, IX and X, XI and 
XII. In the second place, the internal dispositions of the letters, although the tables are derived 
from the same pair of components, are quite diverse. For example, in table I-B the horizontal 
sequences are identical, but are merely displaced to the right and to the left different intervals 
according to the successive key letters. Hence this square shows what may be termed a hor- 
izontally-displaced, direct symmetry of the cipher component. Vertically, it ^ows no symmetry, 
or if there is symmetry, it is not visible.^ But when Table I-B is more carefuUy examined, an 
invisible; or indirect, vertical symmetry may be discerned where at first glance it is not apparent. 
If one takes any two columns of the table, it is found that the interval between the members of 
any pair of letters in pne column is the same as the interval between the members of the homolo- 
gous pair of letters in the other column, ^ the disUmce is metiswed <m the dphet component. For 
example, consider the 2d and 16th columns (headed by L and I, respectively) ; take the letters P 
and 6 in the 2d column, and J and W in the 15th column. The distance between P and G on the 
cipher component is 7 intervals; the distance between J and W on the same pomponent is also 
7 intervals. This phenomenon implies a kind of hidden, or latent, or indirect symmetry within 
the cipher square. In fact, it may be stated that every table which sets forth in systematic fashion 
the various secondary alphabets derivable by sliding two primary sequences though all points of 
coincidence to find cipher equivalents must show some kind of symmetiy, both horizontally and 

> It is true that tbs first Column within the table shows the plain-component sequence, but this is merely 
because the method of finding the equivalents in this case is such that this sequence is bound to appear in that 
column, since the successive key letters are k, B. C, . . . Z. and this sequence happens to be identical with 
the plain component in this case. The same is true of Tables V and XI; it is also applicable to the first row of 
Tables IX and X. 
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vertically. The symmetry may be termed tisMe or dired, if the sequences of letters in the rows 
(or columns) are the same throughout and are identical with that of one of the primary com- 
ponents; it may be termed hidden or indirect if the sequences of letters in the rows or columns 
are different, apparently not related to either of the components, but are in reality decimations 
of one of the primary components. 

j. When the twelve tables of Appendix 1 are examined in the light of the foregoing remarks, 
&e type of symmetry found in each may be summarized in the following manner: 





Horizontal ! 


Vartfoal 


Table 


Viaibla or direct 


InTlalbla or Indiiaot 


Visible oc dlraot 


Invisible or Indirect 




Fallows 

plain 

component 


Follows 

cipher 

component 


Follows 

plain 

component 


Follows 

cipher 

component 


Follows 

plain 

oomponeiit 


Follows 

cipher 

component 


Follows 

component 


Follows 

cipher 

component 




Of these twelve types of cipher squares, corresponding to the twelve different ways of using a
pair of sliding primary components to derive secondary alphabets, the ones best known and
most often encountered in cryptographic studies are Tables I-B and II, referred to as being of
the Vigenère type; Tables V and VI, referred to as being of the Beaufort type; and Tables IX
and X, referred to as being of the Delastelle type. It will be noted that the tables of the
Delastelle type show no direct or visible symmetry, either horizontally or vertically, and because
of this are supposed to yield more security than do any of the other types of tables. But it will
presently be shown that the supposed increase in security is more illusory than real.

k. The foregoing facts concerning the various types of quadricular tables generated by diverse
methods of using sliding primary components or their equivalent rotating cipher disks will be
employed to good advantage when the studies presently to be undertaken bring the student
to the place where he can comprehend them in the analysis of polyalphabetic systems. But in
order not to confuse him with a multiplicity of details which have no direct bearing upon basic
principles, one and only one standard method of finding equivalents by means of sliding
components will be selected from among the twelve available, as set forth in the preceding
subparagraphs. Unless otherwise stated, this method will be the one denoted by the first of the
formulae listed in subpar. f, viz:

$\theta_{k/2} = \theta_{i/1}$; $\theta_{p/1} = \theta_{c/2}$

Calling the plain component “1” and the cipher component “2”, this will mean that the key letter
on the cipher component will be set opposite the index, which will be the first letter of the plain
component; the plain-text letter to be enciphered will then be sought on the plain component and
its equivalent will be the letter opposite it on the cipher component.
























Section III 

THEORY OF SOLUTION OF REPEATING-KEY SYSTEMS



                                                                           Paragraph

The three steps in the analysis of repeating-key systems  8

First step: finding the length of the period  9

General remarks on factoring  10

Second step: distributing the cipher text into the component monoalphabets  11

Third step: solving the monoalphabetic distributions  12



8. The three steps in the analysis of repeating-key systems. — a. The method of enciphering 
according to the principle of the repeating key, or repeating alphabets is adequately explained in 
Section XI of Elementary Military Cryptography, and no further reference need be made at this
time. The analysis of a cryptogram of this type, regardless of the kind of cipher alphabets
employed, or their method of production, resolves itself into three distinct and successive steps:

(1) Determination of the length of the repeating key, which is the same as the determination
of the exact number of alphabets involved in the cryptogram;

(2) Allocation or distribution of the letters of the cipher text into the respective cipher
alphabets to which they belong. This is the step which reduces the polyalphabetic text to
monoalphabetic terms;

(3) Analysis of the individual monoalphabetic distributions to determine plain-text values of
the cipher letters in each distribution or alphabet.

b. The foregoing steps will be treated in the order in which mentioned. The first step may
be described briefly as that of determining the period. The second step may be described briefly
as that of reduction to monoalphabetic terms. The third step may be designated as identification of
cipher-text values.
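
As a rough illustration of the second step, once a period has been settled upon, the cipher
letters can be dealt out into as many distributions as there are alphabets. The Python sketch
below is only that; the function name `distribute`, the sample letters, and the assumed period of
4 are chosen for illustration and are not taken from the text.

```python
from collections import Counter


def distribute(cipher_text, period):
    """Deal the cipher letters into one string per cipher alphabet (column of the key)."""
    return [cipher_text[i::period] for i in range(period)]


sample = "USYESECPMPLC"          # a short run of cipher letters, assumed for illustration
columns = distribute(sample, 4)  # the period 4 is an assumption of this example

for number, letters in enumerate(columns, start=1):
    # Each column is monoalphabetic and can be given its own frequency distribution.
    print(number, letters, dict(Counter(letters)))
```

Each string returned holds the letters enciphered by a single alphabet, which is the form in which
the third step takes them up.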

9. First step: finding the length of the period. — a. The determination of the period, that 
is, the length of the key or the number of cipher alphabets involved in a cryptogram enciphered
by the repeating-key method is, as a rule, a relatively simple matter. The cryptogram itself
usually manifests externally certain phenomena which are the direct result of the use of a
repeating key. The principles involved are, however, so fundamental in cryptanalysis that their
elucidation warrants a somewhat detailed treatment. This will be done in connection with a
short example of encipherment, shown in Fig. 1.

MESSAGE

THE ARTILLERY BATTALION MARCHING IN THE REAR OF THE ADVANCE GUARD KEEPS
ITS COMBAT TRAIN WITH IT INSOFAR AS PRACTICABLE.
[Key: BLUE, using direct standard alphabets]

CIPHER ALPHABETS

Plain         A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

Cipher  (1)   B C D E F G H I J K L M N O P Q R S T U V W X Y Z A
        (2)   L M N O P Q R S T U V W X Y Z A B C D E F G H I J K
        (3)   U V W X Y Z A B C D E F G H I J K L M N O P Q R S T
        (4)   E F G H I J K L M N O P Q R S T U V W X Y Z A B C D

a (plain text written out under the key)

    B L U E      B L U E
    T H E A      A R D K
    R T I L      E E P S
    L E R Y      I T S C
    B A T T      O M B A
    A L I O      T T R A
    N M A R      I N W I
    C H I N      T H I T
    G I N T      I N S O
    H E R E      F A R A
    A R O F      S P R A
    T H E A      C T I C
    D V A N      A B L E
    C E G U

b (each line of plain text with its cipher equivalents beneath)

    B L U E      B L U E
    T H E A      A R D K
    U S Y E      B C X O

    R T I L      E E P S
    S E C P      F P J W

    L E R Y      I T S C
    M P L C      J E M G

    B A T T      O M B A
    C L N X      P X V E

    A L I O      T T R A
    B W C S      U E L E

    N M A R      I N W I
    O X U V      J Y Q M

    C H I N      T H I T
    D S C R      U S C X

    G I N T      I N S O
    H T H X      J Y M S

    H E R E      F A R A
    I P L I      G L L E

    A R O F      S P R A
    B C I J      T A L E

    T H E A      C T I C
    U S Y E      D E C G

    D V A N      A B L E
    E G U R      B M F I

    C E G U
    D P A Y

CRYPTOGRAM

USYES  ECPMP  LCCLN  XBWCS  OXUVD
SCRHT  HXIPL  IBCIJ  USYEE  GURDP
AYBCX  OFPJW  JEMGP  XVEUE  LEJYQ
MUSCX  JYMSG  LLETA  LEDEC  GBMFI

FIGURE 1.
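
The encipherment of Fig. 1 can be reproduced with a short Python sketch. It assumes, as the figure
states, direct standard alphabets and the key BLUE, so that each cipher letter is the plain letter
advanced by the alphabetic position of the key letter; the function names are illustrative.

```python
import string

ALPHABET = string.ascii_uppercase

MESSAGE = ("THE ARTILLERY BATTALION MARCHING IN THE REAR OF THE ADVANCE GUARD KEEPS "
           "ITS COMBAT TRAIN WITH IT INSOFAR AS PRACTICABLE.")


def encipher_repeating_key(plain_text, key):
    """Encipher with direct standard alphabets selected by a repeating key word."""
    letters = [ch for ch in plain_text.upper() if ch in ALPHABET]
    out = []
    for n, ch in enumerate(letters):
        key_letter = key[n % len(key)]          # the key repeats over and over
        out.append(ALPHABET[(ALPHABET.index(ch) + ALPHABET.index(key_letter)) % 26])
    return "".join(out)


def in_groups(text, size=5):
    """Break the text into groups of the given size for transmission."""
    return " ".join(text[i:i + size] for i in range(0, len(text), size))


cryptogram = encipher_repeating_key(MESSAGE, "BLUE")
print(in_groups(cryptogram))
```

Printing the result in five-letter groups gives a form that can be compared directly with the
cryptogram of the figure.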



b. Regardless of what system is used, identical plain-text letters enciphered by the same
cipher alphabet¹ must yield identical cipher letters. Referring to Fig. 1, such a condition is
brought about every time that identical plain-text letters happen to be enciphered with the same
key letter, or every time identical plain-text letters fall into the same column in the
encipherment.² Now since the number of columns or positions with respect to the key is very limited
(except in the case of very long key words), and since the repetition of letters is an inevitable
condition in plain text, it follows that there will be in a message of fair length many cases where
identical plain-text letters must fall into the same column. They will thus be enciphered by the
same cipher alphabet, resulting, therefore, in the production of many identical letters in the
cipher text and these will represent identical letters in the plain text. When identical plain-text
polygraphs fall into identical columns the result is the formation of identical cipher-text
polygraphs, that is, repetitions of groups of 2, 3, 4, . . . letters are exhibited in the cryptogram.
Repetitions of this type will hereafter be called causal repetitions, because they are produced by
a definite, traceable cause, viz, the encipherment of identical letters by the same cipher alphabets.

c. It will also happen, however, that different plain-text letters falling in different columns
will, by mere accident, produce identical cipher letters. Note, for example, in Fig. 1 that in
Column 1, $R_p$ becomes $S_c$ and that in Column 2, $H_p$ also becomes $S_c$. The production of an
identical cipher-text letter in these two cases (that is, a repetition where the plain-text letters
are different and enciphered by different alphabets) is merely fortuitous. It is, in everyday
language, “a mere coincidence”, or “an accident.” For this reason repetitions of this type will
hereafter be called accidental repetitions.

d. A consideration of the phenomenon pointed out in c makes it obvious that in polyalphabetic
ciphers it is important that the cryptanalyst be able to tell whether the repetitions he finds
in a specific case are causal or accidental in their origin, that is, whether they represent actual
encipherments of identical plain-text letters by identical keying elements, or mere coincidences
brought about purely fortuitously.

e. Now accidental repetitions will, of course, happen fairly frequently with individual letters,
but less frequently with digraphs, because in this case the same kind of an “accident” must take
place twice in succession. Intuitively one feels that the chances that such a purely fortuitous
coincidence will happen two times in succession must be much less than that it will happen every
once in a while in the case of single letters. Similarly, intuition makes one feel that the chances
of such accidents happening in the case of three or more consecutive letters are still less than in
the case of digraphs, decreasing very rapidly as the repetition increases in length.
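
How quickly the chance of an accidental repetition falls off with length can be gauged by a crude
calculation. The Python sketch below is not the method of the Appendix; it assumes, purely for
illustration, that cipher letters behave as independent random draws from 26 equally likely
letters, and it counts the expected number of pairs of positions that would agree by accident. The
function name and the message length of 100 letters are assumptions of the example.

```python
from math import comb


def expected_accidental(n_letters, length, alphabet_size=26):
    """Expected number of coinciding pairs of groups of the given length in random text."""
    pairs = comb(n_letters - length + 1, 2)   # ways of choosing two starting positions
    return pairs / alphabet_size ** length    # each pair agrees with chance 1/26**length


for length in (1, 2, 3, 4):
    print(length, round(expected_accidental(100, length), 4))
```

Under this simplified model the expectation shrinks by a factor of roughly 26 with every added
letter, which agrees with the intuition just stated.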

f. The phenomena of cryptographic repetition may, fortunately, be dealt with statistically,
thus taking the matter outside the realm of intuition and putting it on a firm mathematical or
objective basis. Moreover, often the statistical analysis will tell the cryptanalyst when he has
arranged or rearranged his text properly, that is, when he is approaching or has reached
monoalphabeticity in his efforts to reduce polyalphabetic text to its simplest terms. However, in
order to preserve continuity of thought it is deemed inadvisable to inject these statistical
considerations at this place in the text proper; they have been incorporated in Appendix 2 hereof.
The student is advised to study the Appendix very carefully after he has finished reading this
section of the text.

¹ It is to be understood, of course, that cipher alphabets with single equivalents are meant in this case.

² The frequency with which this condition may be expected to occur can be definitely calculated. A
discussion of this point falls beyond the scope of the present text.

g. At this point it will merely be indicated that if a cryptanalyst were to have at hand only
the cryptogram of Fig. 1, with the repetitions underlined as below, a statistical study of the
number and length of the repetitions within the message (Par. 5 of Appendix 2) would tell him
that while some of the digraphic repetitions may be accidental, the chances that they all are
accidental are small. In the case of the tetragraphic repetition he would realize that the
chances of its being accidental are very small indeed.



A.  USYES  ECPMP  LCCLN  XBWCS  OXUVD
    ----   --  -  -

B.  SCRHT  HXIPL  IBCIJ  USYEE  GURDP
    --        --   --    ----

C.  AYBCX  OFPJW  JEMGP  XVEUE  LEJYQ
      ---                       ----

D.  MUSCX  JYMSG  LLETA  LEDEC  GBMFI
     ----  --      --    -- --



h. A consideration of the facts therefore leads to but one conclusion, viz, that the repetitions
exhibited by the cryptogram under investigation are not accidental but are causal in their origin;
and the cause is in this case not difficult to find: repetitions in the plain text were actually
enciphered by identical alphabets. In order for this to occur, it was necessary that the tetragraph
USYE, for example, fall both times in exactly the same relative position with respect to the key.
Note, for example, that USYE in Fig. 1 represents in both cases the plain-text polygraph THEA.
The first time it occurred it fell in positions 1-2-3-4 with respect to the key; the second time it
occurred it happened to fall in the very same relative positions, although it might just as well
have happened to fall in any of the other three possible relative positions with respect to the
key, viz, 2-3-4-1, 3-4-1-2, or 4-1-2-3.

i. Lest the student be misled, however, a few more words are necessary on this subject.
In the preceding subparagraph the word “happened” was used; this word correctly expresses
the idea in mind, because the insertion or deletion of a single plain-text letter between the two
occurrences would have thrown the second occurrence one letter forward or backward,
respectively, and thus caused the polygraph to be enciphered by a sequence of alphabets such as can
no longer produce the cipher polygraph USYE from the plain-text polygraph THEA. On the
other hand, the insertion or deletion of this one letter might bring the letters of some other
polygraph into similar columns so that some other repetition would be exhibited in case the
USYE repetition had thus been suppressed.

j. The encipherment of similar letters by similar cipher alphabets is therefore the cause of
the production of repetitions in the cipher text in the case of repeating-key ciphers. What
principles can be derived from this fact, and how can they be employed in the solution of
cryptograms of this type?

k. If a count is made of the number of letters from and including the first USYE to, but not
including, the second occurrence of USYE, a total of 40 letters is found to intervene between the
two occurrences. This number, 40, must, of course, be an exact multiple of the length of the key.
Having the plain text before one, it is easily seen that it is the 10th multiple; that is, the
4-letter key has repeated itself 10 times between the first and the second occurrence of USYE. It
follows, therefore, that if the length of the key were not known, the number 40 could safely be
taken to be an exact multiple of the length of the key; in other words, one of the factors of the
number 40 would be equal to the length of the key. The word “safely” is used in the preceding
sentence to mean that the interval 40 applies to a repetition of 4 letters and it has been shown
that the chances that this repetition is accidental are small. The factors of 40 are 2, 4, 5, 8,
10, and 20. So far as this single repetition of USYE is concerned, if the length of the key were not
known, all that could be said about the latter would be that it is equal to one of these factors.
The repetition by itself gives no further indications. How can the exact factor be selected from
among a list of several possible factors?
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' I, Let the intervals between all the repetitions in the cryptogram be listed. They Are as 
follows: 



Seiietitlan 


Interval 


factors 


1st USYE to 2d USYE 


40 


2, 4, 5, 8, 10, 20. ‘ 


1st BC to 2d BC 


16 


2,4,8. 


1st CX +.n 2d CX 


26 


6. 


1st EC to 2d EC 


88 


2,4,11, 22,44. 


1st LE to 2d LE . _ 


16 


2, 4, 8. 


2d LE to 3d LE 


4 


2 , 4. 


I stLE to .3d LE . 


20 


2, 4, 5, 10. 


lstJYto2djy. 


8 


2, 4. 


1st PL to 2d PL.. 


24 


2, 3, 4, 6, 8, 10, 12. 


1st SC to 2d SC 


52 


2, 4, 13, 26. 


(1st SV to 2d SY, already included in USYE.) 






(1st US to 2d US, already included in USYE.) 






2d US to 3d US 


36 


2, 3, 4, 6, 0, 18. 


(1st US to 3d US, already included in USYE.) 






(1st YE to 2d YE, already included in USYE.) 







m. Are all these repetitions causal repetitions? It can be shown (Appendix 2, par. 4c) that
the odds against a theory that the USYE repetition is accidental are about 99 to 1 (since the
probability for its occurrence is .01). It can also be shown that the odds against a theory that the
10 digraphs which occur two or more times are accidental repetitions are over 4 to 1 (Appendix
2, par. 5c); the odds against a theory that the two digraphs which occur 3 times are accidental
repetitions are quite large. (Probability is calculated to be about .06.) The chances are very
great, therefore, that all or nearly all these repetitions are causal. Certainly the chances against
the two occurrences of the tetragraph USYE and the three occurrences of the two different digraphs
(LE and US) being accidental are quite high, and it is therefore not astonishing that the intervals
between all the various repetitions, except in one case, contain the factors 2 and 4.

n. This means that if the cipher is written out in either 2 columns or 4 columns, all these
repetitions (except the CX repetition) would fall into the same columns. From this it follows
that the length of the key is either 2 or 4, the latter, on practical grounds, being more probable
than the former. Doubts concerning the matter of choosing between a 2-letter and a 4-letter
key will be dissolved when the cipher text is distributed into its component uniliteral frequency
distributions.
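
The tally of factors on which this conclusion rests can be sketched in a few lines of Python. This is an illustration only: the function name `factors` is not from the text, and the CX interval is taken here as 25 (factor 5), an assumption consistent with the remark that this one repetition lacks the factors 2 and 4.

```python
from collections import Counter


def factors(n):
    """All divisors of n from 2 up to n; each is a candidate key length."""
    return [d for d in range(2, n + 1) if n % d == 0]


# Intervals between repetitions, in the order listed above.
# Assumption: the CX interval is 25.
intervals = [40, 16, 25, 88, 16, 4, 20, 8, 24, 52, 36]

# Count in how many intervals each factor occurs.
tally = Counter(f for n in intervals for f in factors(n))

for f in sorted(tally):
    if f <= 10:
        print(f"factor {f:2d} occurs in {tally[f]:2d} of {len(intervals)} intervals")
```

Under these assumptions the factors 2 and 4 each occur in ten of the eleven intervals, which is the constancy the paragraph describes; no other factor comes close.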

o. The repeated digraph CX in the foregoing message is an accidental repetition, as will be
apparent by referring to Fig. 1. Had the message been longer there would have been more
such accidental repetitions, but, on the other hand, there would be a proportionately greater
number of causal repetitions. This is because the phenomenon of repetition in plain text is
so all-pervading.

p. Sometimes it happens that the cryptanalyst quickly notes a repetition of a polygraph of
four or more letters, the interval between the first and second occurrences of which has only
two factors, of which one is a relatively small number, the other a relatively high incommensurable
number. He may therefore assume at once that the length of the key is equal to the
smaller factor without searching for additional recurrences upon which to corroborate his
assumption. Suppose, for example, that in a relatively short cryptogram the interval between
the first and second occurrences of a polygraph of five letters happens to be a number such as
203, the factors of which are 7 and 29. Evidently the number of alphabets may at once be
assumed to be 7, unless one is dealing with messages exchanged among correspondents known
to use long keys. In the latter case one could assume the number of alphabets to be 29.

q. The foregoing method of determining the period in a polyalphabetic cipher is commonly
referred to in the literature as “factoring the intervals between repetitions”; or more often it is
simply called “factoring.” Because the latter is an apt term and is brief, it will be employed
hereafter in this text to designate the process.
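
The search for repetitions that precedes factoring can be sketched as follows. This is a minimal illustration, not a procedure given in the text: the function names and the short sample string are hypothetical, and in practice the search is repeated for several polygraph lengths.

```python
from collections import defaultdict


def repetitions(cipher, n):
    """Map every n-letter sequence that occurs more than once to its start positions."""
    seen = defaultdict(list)
    for i in range(len(cipher) - n + 1):
        seen[cipher[i:i + n]].append(i)
    return {seq: pos for seq, pos in seen.items() if len(pos) > 1}


def intervals(positions):
    """Distances between successive occurrences of one repeated sequence."""
    return [b - a for a, b in zip(positions, positions[1:])]


# Hypothetical sample: the tetragraph ABCD recurs after 8 letters.
sample = "ABCDXYZWABCD"
for seq, pos in repetitions(sample, 4).items():
    print(seq, pos, intervals(pos))
```

Each interval found this way is then factored, and the factor common to nearly all intervals is taken as the probable length of the key.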

10. General remarks on factoring. — a. The statement made in Par. 2 with respect to the 
cyclic phenomena said to be exhibited in cryptograms of the periodic type now becomes clear. 
The use of a short repeating key produces a periodicity of recurrences or repetitions collectively
termed “cyclic phenomena”, an analysis of which leads to a determination of the length of the
period or cycle, and this gives the length of the key. Only in the case of relatively short
cryptograms enciphered by a relatively long key does factoring fail to lead to the correct determination
of the number of cipher alphabets in a repeating-key cipher; and of course, the fact that a
cryptogram contains repetitions whose factors show constancy is in itself an indication and test of its
periodic nature. It also follows that if the cryptogram is not a repeating-key cipher, then
factoring will show no definite results, and conversely the fact that it does not yield definite
results at once indicates that the cryptogram is not a periodic, repeating-key cipher.

b. There are two cases in which factoring leads to no definite results. One is in the case of
monoalphabetic substitution ciphers. Here recurrences are very plentiful as a rule, and the
intervals separating these recurrences may be factored, but the factors will show no constancy;
there will be several factors common to many or most of the recurrences. This in itself is an
indication of a monoalphabetic substitution cipher, if the very fact of the presence of many
recurrences fails to impress itself upon the inexperienced cryptanalyst. The other case in which
the process of factoring is nonsignificant involves certain types of nonperiodic, polyalphabetic
ciphers. In certain of these ciphers recurrences of digraphs, trigraphs, and even polygraphs
may be plentiful in a long message, but the intervals between such recurrences bear no definite
multiple relation to the length of the key, such as in the case of the true periodic, repeating-key
cipher, in which the alphabets change with successive letters and repeat themselves over and
over again.

c. Factoring is not the only method of determining the length of the period of a periodic,
polyalphabetic substitution cipher, although it is by far the most common and easily applied.
At this point it will merely be stated that when the message under study is relatively short in
comparison with the length of the key, so that there are only a few cycles of cipher text and no
long repetitions affording a basis for factoring, there are several other methods available.
However, it being deemed inadvisable to interject the data concerning those other methods
at this point, they will be explained subsequently. It is desirable at this juncture merely to
indicate that methods other than factoring do exist and are used in practical work.

d. Fundamentally, the factoring process is merely a more or less simple mathematical method
of studying the phenomena of periodicity in cryptograms. It will usually enable the
cryptanalyst to ascertain definitely whether or not a given cryptogram is periodic in nature, and if
so, the length of the period, stated in terms of the cryptographic unit involved. By the latter
statement is meant that the factoring process may be applied not only in analyzing the periodicity
manifested by cryptograms in which the plain-text units subjected to cryptographic treatment
are monographic in nature (i. e., are single letters) but also in studying the periodicity exhibited
by those occasional cryptograms wherein the plain-text units are digraphic, trigraphic, or
n-graphic in character. The student should bear this point in mind when he comes to the study
of substitution systems of the latter sort. However, the present text will deal solely with cases
of the former type, wherein the plain-text units subjected to cryptographic treatment are single
letters.










11. Second step: distributing the cipher text into the component monoalphabets. — a.
After the number of cipher alphabets involved in the cryptogram has been ascertained, the next
step is to rewrite the message in groups corresponding to the length of the key, or in columnar
fashion, whichever is more convenient, and this automatically divides up the text so that the
letters belonging to the same cipher alphabet occupy similar positions in the groups, or, if the
columnar method is used, fall in the same column. The letters are thus allocated or distributed
into the respective cipher alphabets to which they belong. This reduces the polyalphabetic
text to monoalphabetic terms.

b. Then separate uniliteral frequency distributions for the thus isolated individual alphabets
are compiled. For example, in the case of the cipher on page 13, having determined that four
alphabets are involved, and having rewritten the message in four columns, a frequency distribution
is made of the letters in Column 1, another is made of the letters in Column 2, and so on for
the rest of the columns. Each of the resulting distributions is therefore a monoalphabetic frequency
distribution. If these distributions do not give the characteristic irregular crest and trough
appearance of monoalphabetic frequency distributions, then the analysis which led to the
hypothesis as regards the number of alphabets involved is fallacious. In fact, the appearance of
these individual distributions may be considered to be an index of the correctness of the factoring
process; for theoretically, and practically, the individual distributions constructed upon the
correct hypothesis will tend to conform more closely to the irregular crest and trough appearance
of a monoalphabetic frequency distribution than will the graphic tables constructed upon an
incorrect hypothesis. These individual distributions may also be tested for monoalphabeticity
by statistical methods.
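
The distribution into columns, and one possible statistical test, can be sketched as follows. The text does not say here which statistical method is meant; the index of coincidence is used below purely as an assumed example of such a test, and the function names are illustrative. For English plain text (and hence for a monoalphabetic substitution of it) the index is roughly 0.066, against roughly 0.038 for random text.

```python
from collections import Counter


def distribute(cipher, period):
    """Split the cipher text into `period` columns, one per cipher alphabet."""
    return [cipher[i::period] for i in range(period)]


def frequency_distribution(column):
    """Uniliteral frequency distribution of one column."""
    return Counter(column)


def index_of_coincidence(text):
    """Chance that two letters drawn at random from `text` are alike."""
    n = len(text)
    counts = Counter(text)
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))


# Hypothetical ten-letter text written out in four columns.
for number, column in enumerate(distribute("ABCDEFGHIJ", 4), start=1):
    print("Column", number, column, dict(frequency_distribution(column)))
```

If the assumed period is correct, each column should give an index near the monoalphabetic value; if it is wrong, the columns tend toward the random value.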

12. Third step: solving the monoalphabetic distributions. — The difficulty experienced in
analyzing the individual or isolated frequency distributions depends mostly upon the type of
cipher alphabets that is used. It is apparent that mixed alphabets may be used just as easily as
standard alphabets, and, of course, the cipher letters themselves give no indication as to which
is the case. However, just as it was found that in the case of monoalphabetic substitution ciphers,
a uniliteral frequency distribution gives clear indications as to whether the cipher alphabet is a
standard or a mixed alphabet, by the relative positions and extensions of the crests and troughs
in the table, so it is found that in the case of repeating-key ciphers, uniliteral frequency distributions
for the isolated or individual alphabets will also give clear indications as to whether these
alphabets are standard alphabets or mixed alphabets. Only one or two such frequency distributions
are necessary for this determination; if they appear to be standard alphabets, distributions
can be made for the rest of the alphabets; but if they appear to be mixed alphabets, then
it is best to compile triliteral frequency distributions for all the alphabets. The analysis of the
values of the cipher letters in each table proceeds along the same lines as in the case of
monoalphabetic ciphers. The analysis is more difficult only because of the reduced size of the tables, but
if the message be very long, then each frequency distribution will contain a sufficient number of
elements to enable a speedy solution to be achieved.

REPEATING-KEY SYSTEMS WITH STANDARD CIPHER ALPHABETS

Solution by the “probable-word method” 15

13. Solution by applying principles of frequency. — a. In the light of the foregoing principles, 
let the following cryptogram be studied: 

                                 Message

              1           2           3           4           5

   A.     A U K H Y   J A M K I   Z Y M W M   J M I G X   N F M L X
   B.     E T I M I   Z H B H R   A Y M Z M   I L V M E   J K U T G
   C.     D P V X K   Q U K H Q   L H V R M   J A Z N G   G Z V X E
   D.     N L U F M   P Z J N V   C H U A S   H K Q G K   I P L W P
   E.     A J Z X I   G U M T V   D P T E J   E C M Y S   Q Y B A V
   F.     A L A H Y   P O E X W   P V N Y E   E Y X E E   U D P X R
   G.     B V Z V I   Z I I V O   S P T E G   K U B B R   Q L L X P
   H.     W F Q G K   N L L L E   P T I K W   D J Z X I   G O I O I
   J.     Z L A M V   K F M W F   N P L Z I   O V V F M   Z K T X G
   K.     N L M D F   A A E X I   J L U F M   P Z J N V   C A I G I
   L.     U A W P R   N V I W E   J K Z A S   Z L A F M   H S



A search for repetitions discloses the following short list with the intervals and factors 
above 10 omitted (for previous experience may lead to the conclusion that it is unlikely that the 
cryptogram involves more than 10 alphabets, showing the number of recurrences which it does): 













[List of repetitions, intervals, and factors not reproduced.]




b. The factor 5 appears in all but two cases, each of which involves only a digraph. It seems
almost certain that the number of alphabets is five. Since the text already appears in groups of
five letters, it is unnecessary to rewrite the message. The next step is to make a uniliteral
frequency distribution for Alphabet 1 to see if it can be determined whether or not standard
alphabets are involved. It is as follows:

                                  Alphabet 1

 5  1  2  3  3  -  3  2  2  6  2  1  -  6  1  5  3  -  1  -  2  -  1  -  -  6
 A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X  Y  Z

c. Although the indications are not very clear cut, yet if one takes into consideration the
small amount of data the assumption of a direct standard alphabet with Wc=Ap is worth further
test. Accordingly a similar distribution is made for Alphabet 2.

                                  Alphabet 2

 5  -  1  1  -  3  -  3  1  2  4  9  1  -  2  5  -  -  1  2  4  4  -  -  4  3
 A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X  Y  Z

d. There is every indication of a direct standard alphabet, with Hc=Ap. Let similar distributions
be made for the last three alphabets. They are as follows:

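                                  Alphabet 3

 3  3  -  -  2  -  -  -  7  2  2  4  8  1  -  1  2  -  -  3  4  5  1  1  -  5
 A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X  Y  Z

                                  Alphabet 4

 3  1  -  1  3  4  4  4  -  -  2  2  3  3  1  1  -  1  -  2  -  2  4  9  2  2
 A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X  Y  Z

                                  Alphabet 5

 -  -  -  -  6  2  4  -  9  1  3  -  7  -  1  2  1  4  3  -  -  5  2  2  2  -
 A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X  Y  Z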

e. After but little experiment it is found that the distributions can best be made to fit
the normal when the following values are assumed:

        Alphabet 1 ........ Ap=Wc
        Alphabet 2 ........ Ap=Hc
        Alphabet 3 ........ Ap=Ic
        Alphabet 4 ........ Ap=Tc
        Alphabet 5 ........ Ap=Ec
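
The fitting of each distribution to the normal can be imitated mechanically. The sketch below is one possible way of doing so, not the procedure of the text: it slides each column against an assumed table of approximate English letter frequencies (in percent) and keeps the slide giving the highest total. The table `FREQ`, the function names, and the scoring rule are all assumptions made for illustration, and any key it suggests should be confirmed by trial decipherment.

```python
CIPHER = (
    "AUKHY JAMKI ZYMWM JMIGX NFMLX ETIMI ZHBHR AYMZM ILVME JKUTG "
    "DPVXK QUKHQ LHVRM JAZNG GZVXE NLUFM PZJNV CHUAS HKQGK IPLWP "
    "AJZXI GUMTV DPTEJ ECMYS QYBAV ALAHY POEXW PVNYE EYXEE UDPXR "
    "BVZVI ZIIVO SPTEG KUBBR QLLXP WFQGK NLLLE PTIKW DJZXI GOIOI "
    "ZLAMV KFMWF NPLZI OVVFM ZKTXG NLMDF AAEXI JLUFM PZJNV CAIGI "
    "UAWPR NVIWE JKZAS ZLAFM HS"
).replace(" ", "")

# Approximate relative frequencies of English letters, in percent (assumed values).
FREQ = dict(zip(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ",
    [8.2, 1.5, 2.8, 4.3, 12.7, 2.2, 2.0, 6.1, 7.0, 0.15, 0.8, 4.0, 2.4,
     6.7, 7.5, 1.9, 0.1, 6.0, 6.3, 9.1, 2.8, 1.0, 2.4, 0.15, 2.0, 0.07],
))


def slide(text, k):
    """Move every letter k places along the normal alphabet."""
    return "".join(chr((ord(c) - 65 + k) % 26 + 65) for c in text)


def fit_alphabet(column):
    """Cipher letter standing for plain A under the best-fitting direct standard alphabet."""
    best = max(range(26), key=lambda k: sum(FREQ[c] for c in slide(column, -k)))
    return chr(best + 65)


def guess_key(cipher, period):
    return "".join(fit_alphabet(cipher[i::period]) for i in range(period))


def decipher(cipher, key):
    return "".join(slide(c, -(ord(key[i % len(key)]) - 65)) for i, c in enumerate(cipher))


key = guess_key(CIPHER, 5)
print(key)
print(decipher(CIPHER, key)[:25])
```

With distributions as short as these the highest total is a strong indication rather than a proof, which is why the text insists on testing the values on the cryptogram itself.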

f. Note the key word given by the successive equivalents of Ap: WHITE. The real proof of
the correctness of the analysis is, of course, to test the values of the solved alphabets on the
cryptogram. The five complete cipher alphabets are as follows:

   Plain ..........  A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

   Cipher 1 .......  W X Y Z A B C D E F G H I J K L M N O P Q R S T U V
          2 .......  H I J K L M N O P Q R S T U V W X Y Z A B C D E F G
          3 .......  I J K L M N O P Q R S T U V W X Y Z A B C D E F G H
          4 .......  T U V W X Y Z A B C D E F G H I J K L M N O P Q R S
          5 .......  E F G H I J K L M N O P Q R S T U V W X Y Z A B C D

                                  Figure 2.

g. Applying these values to the first few groups of our message, the following is found:

                 1 2 3 4 5   1 2 3 4 5   1 2 3 4 5   1 2 3 4 5   1 2 3 4 5
   Cipher .....  A U K H Y   J A M K I   Z Y M W M   J M I G X   N F M L X . . .
   Plain ......  E N C O U   N T E R E   D R E D I   N F A N T   R Y E S T . . .

h. Intelligible text at once results, and the solution can now be completed very quickly.
The complete message is as follows:

ENCOUNTERED RED INFANTRY ESTIMATED AT ONE REGIMENT AND MACHINE GUN COMPANY
IN TRUCKS NEAR EMMITSBURG. AM HOLDING MIDDLE CREEK NEAR HILL 543 SOUTHWEST
OF FAIRPLAY. WHEN FORCED BACK WILL CONTINUE DELAYING REDS AT MARSH
CREEK. HAVE DESTROYED BRIDGES ON MIDDLE CREEK BETWEEN EMMITSBURG-TANEYTOWN
ROAD AND RHODES MILL.

i. In the foregoing example (which is typical of the system erroneously attributed, in
cryptographic literature, to the French cryptographer Vigenère, although to do him justice, he
made no claim of having “invented” it), direct standard alphabets were used, but it is obvious
that reversed standard alphabets may be used and the solution accomplished in the same
manner. In fact, the now obsolete cipher disk used by the United States Army for a number
of years yields exactly this type of cipher, which is also known in the literature as the Beaufort
Cipher, and by other names. In fitting the isolated frequency distributions to the normal, the
direction of “reading” the crests and troughs is merely reversed.
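
The construction of the cipher alphabets from the key word, and their use in deciphering, can be sketched as follows. The function names and the `reversed_alphabets` switch are illustrative assumptions; the reversed case simply writes the cipher component backward from the key letter, so that the key letter again stands for plain A.

```python
from string import ascii_uppercase as NORMAL


def cipher_alphabets(key_word, reversed_alphabets=False):
    """One cipher sequence per key letter, to be read against the normal plain sequence."""
    rows = []
    for k in key_word:
        i = NORMAL.index(k)
        if reversed_alphabets:
            # Reversed standard sequence beginning at the key letter.
            rows.append("".join(NORMAL[(i - j) % 26] for j in range(26)))
        else:
            # Direct standard sequence beginning at the key letter.
            rows.append(NORMAL[i:] + NORMAL[:i])
    return rows


def decipher(cipher, key_word, reversed_alphabets=False):
    """Replace each cipher letter by the plain letter standing above it in its alphabet."""
    rows = cipher_alphabets(key_word, reversed_alphabets)
    return "".join(NORMAL[rows[n % len(rows)].index(c)] for n, c in enumerate(cipher))


for number, row in enumerate(cipher_alphabets("WHITE"), start=1):
    print(number, row)
print(decipher("AUKHYJAMKIZYMWMJMIGXNFMLX", "WHITE"))
```

The same `decipher` call with `reversed_alphabets=True` serves for messages prepared with reversed standard alphabets, since only the direction of the cipher component changes.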

14. Solution by completing the plain-component sequence. — a. There is another method
of solving this type of cipher, which is worthwhile explaining, because the underlying principles
will be found useful in many cases. It is a modification of the method of solution by completing
the plain-component sequence, already explained in Military Cryptanalysis, Part I.

b. After all, the individual alphabets of a cipher such as the one just solved are merely
direct standard alphabets. It has been seen that monoalphabetic ciphers in which standard
cipher alphabets are employed may be solved almost mechanically by completing the
plain-component sequence. The plain text reappears on only one generatrix and this generatrix is the
same for the whole message. It is easy to pick this generatrix out of all the other generatrices
because it is the only one which yields intelligible text. Is it not apparent that if the same process
is applied to the cipher letters of the individual alphabets of the cipher just solved that the
plain-text equivalents of these letters must all reappear on one and the same generatrix? But how
will the generatrix which actually contains the plain-text letters be distinguishable from the
other generatrices, since these plain-text letters are not consecutive letters in the plain text but
only letters separated from one another by a constant interval? The answer is simple. The
plain-text generatrix should be distinguishable from the others because it will show more and a better
assortment of high-frequency letters, and can thus be selected by the eye from the whole set of
generatrices. If this is done with all the alphabets in the cryptogram, it will merely be necessary to
assemble the letters of the thus selected generatrices in proper order, and the result should be
consecutive letters forming intelligible text.

c. An example will serve to make the process clear. Let the same message be used as before. 
Factoring showed that it involves five alphabets. Let the first ten cipher letters in each alphabet 
be set down in a horizontal line and let the normal alphabet sequences be completed. Thus: 




         Alphabet 1     Alphabet 2     Alphabet 3     Alphabet 4     Alphabet 5

    1    AJZJNEZAIJ     UAYMFTHYLK     KMMIMIBMVU     HKWGLMHZMT     YIMXXIRMEG
    2    BKAKOFABJK     VBZNGUIZML     LNNJNJCNWV     ILXHMNIANU     ZJNYYJSNFH
    3    CLBLPGBCKL     WCAOHVJANM     MOOKOKDOXW     JMYINOJBOV     AKOZZKTOGI
    4    DMCMQHCDLM     XDBPIWKBON     NPPLPLEPYX     KNZJOPKCPW     BLPAALUPHJ
    5    ENDNRIDEMN*    YECQJXLCPO     OQQMQMFQZY     LOAKPQLDQX     CMQBBMVQIK
    6    FOEOSJEFNO     ZFDRKYMDQP     PRRNRNGRAZ     MPBLQRMERY     DNRCCNWRJL
    7    GPFPTKFGOP     AGESLZNERQ     QSSOSOHSBA     NQCMRSNFSZ     EOSDDOXSKM
    8    HQGQULGHPQ     BHFTMAOFSR     RTTPTPITCB     ORDNSTOGTA*    FPTEEPYTLN
    9    IRHRVMHIQR     CIGUNBPGTS     SUUQUQJUDC     PSEOTUPHUB     GQUFFQZUMO
   10    JSISWNIJRS     DJHVOCQHUT     TVVRVRKVED     QTFPUVQIVC     HRVGGRAVNP
   11    KTJTXOJKST     EKIWPDRIVU     UWWSWSLWFE     RUGQVWRJWD     ISWHHSBWOQ
   12    LUKUYPKLTU     FLJXQESJWV     VXXTXTMXGF     SVHRWXSKXE     JTXIITCXPR
   13    MVLVZQLMUV     GMKYRFTKXW     WYYUYUNYHG     TWISXYTLYF     KUYJJUDYQS
   14    NWMWARMNVW     HNLZSGULYX     XZZVZVOZIH     UXJTYZUMZG     LVZKKVEZRT
   15    OXNXBSNOWX     IOMATHVMZY     YAAWAWPAJI     VYKUZAVNAH     MWALLWFASU
   16    PYOYCTOPXY     JPNBUIWNAZ     ZBBXBXQBKJ     WZLVABWOBI     NXBMMXGBTV
   17    QZPZDUPQYZ     KQOCVJXOBA     ACCYCYRCLK     XAMWBCXPCJ     OYCNNYHCUW
   18    RAQAEVQRZA     LRPDWKYPCB     BDDZDZSDML     YBNXCDYQDK     PZDOOZIDVX
   19    SBRBFWRSAB     MSQEXLZQDC     CEEAEATENM*    ZCOYDEZREL     QAEPPAJEWY
   20    TCSCGXSTBC     NTRFYMARED*    DFFBFBUFON     ADPZEFASFM     RBFQQBKFXZ
   21    UDTDHYTUCD     OUSGZNBSFE     EGGCGCVGPO     BEQAFGBTGN     SCGRRCLGYA
   22    VEUEIZUVDE     PVTHAOCTGF     FHHDHDWHQP     CFRBGHCUHO     TDHSSDMHZB
   23    WFVFJAVWEF     QWUIBPDUHG     GIIEIEXIRQ     DGSCHIDVIP     UEITTENIAC*
   24    XGWGKBWXFG     RXVJCQEVIH     HJJFJFYJSR     EHTDIJEWJQ     VFJUUFOJBD
   25    YHXHLCXYGH     SYWKDRFWJI     IKKGKGZKTS     FIUEJKFXKR     WGKVVGPKCE
   26    ZIYIMDYZHI     TZXLESGXKJ     JLLHLHALUT     GJVFKLGYLS     XHLWWHQLDF

   * Generatrix showing the most and the best assortment of high-frequency letters.

                                  Figure 3.

d. If the high-frequency generatrices marked in Figure 3 are selected and their letters
are juxtaposed in columns the consecutive letters of intelligible plain text immediately present
themselves. Thus:

   Selected generatrices:

        For Alphabet 1, generatrix 5 .......  ENDNRIDEMN
        For Alphabet 2, generatrix 20 ......  NTRFYMARED
        For Alphabet 3, generatrix 19 ......  CEEAEATENM
        For Alphabet 4, generatrix 8 .......  ORDNSTOGTA
        For Alphabet 5, generatrix 23 ......  UEITTENIAC

   Columnar juxtaposition of letters from selected generatrices:

        1 2 3 4 5
        E N C O U
        N T E R E
        D R E D I
        N F A N T
        R Y E S T
        I M A T E
        D A T O N
        E R E G I
        M E N T A
        N D M A C

                                  Figure 4.

   Plain text: ENCOUNTERED RED INFANTRY ESTIMATED AT ONE
               REGIMENT AND MAC . . .

e. Solution by this method can thus be achieved without the compilation of any frequency
tables whatever and is very quickly attained. The inexperienced cryptanalyst may have difficulty
at first in selecting the generatrices which contain the most and the best assortment of
high-frequency letters, but with increased practice, a high degree of proficiency is attained.
After all it is only a matter of experiment, trial, and error to select and assemble the proper
generatrices so as to produce intelligible text.

f. If the letters on the sliding strips were accompanied by numbers representing their relative
frequencies in plain text, and these numbers were added across each generatrix, then that
generatrix with the highest total frequency would theoretically always be the plain-text generatrix.
Practically it will be among the generatrices which show the first three or four greatest totals.
Thus, an entirely mathematical solution for this type of cipher may be applied.
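
The summation just described can be sketched as follows, using the first ten cipher letters of each of the five alphabets of the message solved above. The frequency table `FREQ` holds approximate English percentages and is an assumption, as are the function names; the sketch lists the four highest totals for each alphabet rather than asserting that the first is always correct.

```python
# Approximate relative frequencies of English letters, in percent (assumed values).
FREQ = dict(zip(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ",
    [8.2, 1.5, 2.8, 4.3, 12.7, 2.2, 2.0, 6.1, 7.0, 0.15, 0.8, 4.0, 2.4,
     6.7, 7.5, 1.9, 0.1, 6.0, 6.3, 9.1, 2.8, 1.0, 2.4, 0.15, 2.0, 0.07],
))


def generatrices(letters):
    """Complete the normal sequence under each letter: 26 rows, the first being the cipher."""
    return ["".join(chr((ord(c) - 65 + k) % 26 + 65) for c in letters) for k in range(26)]


def ranked(letters, top=4):
    """The `top` generatrices with the greatest frequency totals, as (total, number, row)."""
    totals = [(sum(FREQ[c] for c in row), n + 1, row)
              for n, row in enumerate(generatrices(letters))]
    return sorted(totals, reverse=True)[:top]


columns = ["AJZJNEZAIJ", "UAYMFTHYLK", "KMMIMIBMVU", "HKWGLMHZMT", "YIMXXIRMEG"]
for number, column in enumerate(columns, start=1):
    print("Alphabet", number, [(n, row) for _, n, row in ranked(column)])

# Generatrices 5, 20, 19, 8, and 23, read across in columns.
chosen = [generatrices(c)[g - 1] for c, g in zip(columns, (5, 20, 19, 8, 23))]
print("".join("".join(letters) for letters in zip(*chosen)))
```

Keeping the three or four best candidates per alphabet and assembling them by trial mirrors the practical caution stated in the paragraph.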

g. If the cipher alphabets are reversed standard alphabets, it is only necessary to convert
the cipher letters of each isolated alphabet into their normal, plain-component equivalents and
then proceed as in the case of direct standard alphabets.

h. It has been seen how the key word may be discovered in this type of cryptogram. Usually
the key is made up of those letters in the successive alphabets whose equivalents are Ap but other
conventions are of course possible. Sometimes a key number is used, such as 8-4-7-1-12,
which means merely that Ap is represented by the eighth letter from A (in the normal alphabet)
in the first cipher alphabet, by the fourth letter from A in the second cipher alphabet, and so on.
This modification is known in the literature as the Gronsfeld cipher. However, the method of
solution as illustrated above, being independent of the nature of the key, is the same as before.

15. Solution by the “probable-word method.” — a. The common use of key words in
cryptograms such as the foregoing makes possible a method of solution that is simple and can be used
where the more detailed method of analysis using frequency distributions or by completing the
plain-component sequence is of no avail. In the case of a very short message which may show
no recurrences and give no indications as to the number of alphabets involved, this modified
method will be found most useful.

b. Briefly, the method consists in assuming the presence of a probable word in the message,
and referring to the alphabets to find the key letters applicable when this hypothetical word is
assumed to be present in various positions in the cipher text. If the assumed word happens to
be correct, and is placed in the correct position in the message, the key letters produced by
referring to the alphabets will yield the key word. In the following example it is assumed that
reversed standard alphabets are known to be used by the enemy.

Message 

MDSTJ LQCXC KZASA NYYKO LP 

c. Extraneous circumstances lead to the assumption of the presence of the word AMMUNITION.
One may assume that this word begins the message. Using sliding normal components,
one reversed, the other direct, the key letters are ascertained by noting what the successive
equivalents of Ap are. Thus:

   Cipher ........  M D S T J L Q C X C
   Plain text ....  A M M U N I T I O N
   “Key” .........  M P E N W T J K L P

The key does not spell any intelligible word. One therefore shifts the assumed word one letter
forward and another trial is made.

   Cipher ........  D S T J L Q C X C K
   Plain text ....  A M M U N I T I O N
   “Key” .........  D E F D Y Y V F Q X

This also yields no intelligible key word. One continues to shift the assumed word forward
one space at a time until the following point is reached.

   Cipher ........  L Q C X C K Z A S A
   Plain text ....  A M M U N I T I O N
   “Key” .........  L C O R P S S I G N

The key now becomes evident. It is a cyclic permutation of SIGNAL CORPS. It should be 
clear that since the key word or key phrase repeats itself during the encipherment of such a 
message, the plain-text word upon whose assumed presence in the message this test is being 
based may begin to be enciphered at any point in the key, and continue over into its next
repetition if it is longer than the key. When this is the case it is merely necessary to shift the latter
part of the sequence of key letters to the first part, as in the case noted: LCORPSSIGN is trans- 
posed into SIGN ... LCORPS, and thus SIGNAL CORPS. 
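
The successive trials can be sketched as follows. For reversed standard alphabets the key letter is found by adding the positions of the cipher and plain letters in the normal alphabet; for direct standard alphabets it is found by subtracting. The function names are illustrative assumptions, and deciding whether a recovered fragment is "intelligible" is left to the eye, as in the text.

```python
def key_letters(cipher_segment, word, reversed_alphabets=True):
    """Key letters implied if `word` is the plain text of `cipher_segment`."""
    out = []
    for c, p in zip(cipher_segment, word):
        ci, pi = ord(c) - 65, ord(p) - 65
        k = (ci + pi) % 26 if reversed_alphabets else (ci - pi) % 26
        out.append(chr(k + 65))
    return "".join(out)


def drag(cipher, word, reversed_alphabets=True):
    """Try the assumed word at every position; return (position, key fragment) pairs."""
    last = len(cipher) - len(word)
    return [(i, key_letters(cipher[i:i + len(word)], word, reversed_alphabets))
            for i in range(last + 1)]


CIPHER = "MDSTJLQCXCKZASANYYKOLP"
for position, fragment in drag(CIPHER, "AMMUNITION"):
    print(position + 1, fragment)
```

The fragment obtained with the word beginning at the sixth letter is the one that reads as part of a key phrase.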

d. It will be seen in the foregoing method of solution that the length of the key is of no 
particular interest or consequence in the steps taken in effecting the solution. The determina- 
tion of the length and elements of the key comes after the solution rather than before it. In this 
case the length of the period is seen to be eleven, corresponding to the length of the key (SIGNAL 
CORPS). 

e. The foregoing method is one of the other methods of determining the length of the key
(besides factoring), referred to in Par. 10c.

f. If the assumption of reversed standard alphabets yields no good results, then direct
standard alphabets are assumed and the test made exactly in the same manner. As will be
shown subsequently, the method can also be used as a last resort when mixed alphabets are
employed.

g. When the assumed word is longer than the key, the sequence of recovered key letters will
show a periodicity equal to the length of the key; that is, after a certain number of letters the
sequence of key letters will repeat. This phenomenon would be most useful in the case of keys
that are not intelligible words but are composed of random letters or figures. Of course, if such
a key is longer than the assumed word, this method is of no avail.

h. This method of solution by searching for a word is contingent upon the following cir- 
cumstances: 

(1) That the word whose presence is assumed actually occurs in the message, is properly 
spelled, and correctly enciphered. 

(2) That the sliding components (or equivalent cipher disks or squares) employed in the
search for the assumed word are actually the ones which were employed in the encipherment,
or are such as to give identical results as the ones which were actually used.

(3) That the pair of enciphering equations used in the test is actually the pair which was
employed in the encipherment; or if a cipher square is used in the test, the method of finding
equivalents gives results that correspond with those actually obtained in the encipherment.
(See par. 9.)



i. The foregoing appears to be quite an array of contingencies and the student may think
that on this account the method will often fail. But examining these contingencies one by one,
it will be seen that successful application of the method may not be at all rare — after the solution
of some messages has disclosed what sort of paraphernalia and methods of employing them are
favored by the enemy. From the foregoing remark it is to be inferred that the probable-word
method has its greatest usefulness not in an initial solution of a system, but only after successful
study of enemy communications by more difficult processes of analysis has told its story to the
alert cryptanalyst. Although it is commonly attributed to Bazeries, the French cryptanalyst
of 1900, the probable-word method is very old in cryptanalysis and goes back several centuries.
Its usefulness in practical work may best be indicated by quoting from a competent observer:¹

There is another [method] which is to this first method what the geometric method is to analysis in certain
sciences, and, according to the whims of individuals, certain cryptanalysts prefer one to the other. Certain others,
incapable of getting the answer with one of the methods in the solution of a difficult problem, conquer it by means
of the other, with a disconcerting masterly stroke. This other method is that of the probable word. We may
have more or less definite opinions concerning the subject of the cryptogram. We may know something about its
date, and the correspondents, who may have been indiscreet in the subject they have treated. On this basis, the
hypothesis is made that a certain word probably appears in the text. ... In certain classes of documents,
military or diplomatic telegrams, banking and mining affairs, etc., it is not impossible to make very important
assumptions about the presence of certain words in the text. After a cryptanalyst has worked for a long time
with the writings of certain correspondents, he gets used to their expressions. He gets a whole load of words
to try out; then the changes of key, and sometimes of system, no longer throw into his way the difficulties of an
absolutely new study, which might require the analytical method.

¹ Givierge, M., Cours de Cryptographie, Paris, 1926, p. 30.

REPEATING-KEY SYSTEMS WITH MIXED CIPHER ALPHABETS, I

Reason for the use of mixed alphabets 16
Interrelated mixed alphabets 17
Principles of direct symmetry of position 18
Initial steps in the solution of a typical example 19
Application of principles of direct symmetry of position 20
Subsequent steps in solution 21
Completing the solution 22
Solution of subsequent messages enciphered by same cipher component 23
Summation of relative frequencies as an aid to the selection of the correct generatrices 24
Solution by the probable-word method 25
Solution when plain component is mixed, the cipher component, the normal 26



16. Reason for the use of mixed alphabets. — a. It has been seen in the examples considered
thus far that the use of several alphabets in the same message does not greatly complicate the
analysis of such a cryptogram. There are three reasons why this is so. Firstly, only relatively
few alphabets were employed; secondly, these alphabets were employed in a periodic or repeating
manner, giving rise to cyclic phenomena in the cryptogram, by means of which the number of
alphabets could be determined; and, thirdly, the cipher alphabets were known alphabets, by
which is meant merely that the sequences of letters in both components of the cipher alphabets
were known sequences.

h. In the case of monoalphabetic ciphers it was found that the use of a mixed alphabet 
delayed the solution to a considerable d^ree, and it will now be seen that the use of mixed alpha- 
bets in polyalphabetic ciphers renders the analysis much more difficult tiian the use of standard 
alphabets, but the solution is still fairly easy to achieve. 

17. Interrelated mixed alphabets. -- a. It was stated in Par. 6 that the method of producing
the mixed alphabets in a polyalphabetic cipher often affords clues which are of great assistance
in the analysis of the cipher alphabets. This is so, of course, only when the cipher alphabets
are interrelated secondary alphabets produced by sliding components or their equivalents.
Reference is now made to the classification set forth in Par. 6, in connection with the types of
alphabets which may be employed in polyalphabetic substitution. It will be seen that thus far
only Cases A (1) and (2) have been treated. Case B (1) will now be discussed.

b. Here one of the components, the plain component, is the normal sequence, while the
cipher component is a mixed sequence, the various juxtapositions of the two components yielding
mixed alphabets. The mixed component may be a systematically-mixed or a random-mixed
sequence. If the 25 successive displacements of the mixed component are recorded in separate
lines, a symmetrical cipher square such as that shown in Fig. 5 results therefrom. It is identical
in form with the square table shown on p. 7, labeled Table I-A.

ABCDEFGHIJKLMNOPQRSTUVWXYZ

LEAVNWORTHBCDFGIJKMPQSUXYZ
EAVNWORTHBCDFGIJKMPQSUXYZL
AVNWORTHBCDFGIJKMPQSUXYZLE
VNWORTHBCDFGIJKMPQSUXYZLEA
NWORTHBCDFGIJKMPQSUXYZLEAV
WORTHBCDFGIJKMPQSUXYZLEAVN
ORTHBCDFGIJKMPQSUXYZLEAVNW
RTHBCDFGIJKMPQSUXYZLEAVNWO
THBCDFGIJKMPQSUXYZLEAVNWOR
HBCDFGIJKMPQSUXYZLEAVNWORT
BCDFGIJKMPQSUXYZLEAVNWORTH
CDFGIJKMPQSUXYZLEAVNWORTHB
DFGIJKMPQSUXYZLEAVNWORTHBC
FGIJKMPQSUXYZLEAVNWORTHBCD
GIJKMPQSUXYZLEAVNWORTHBCDF
IJKMPQSUXYZLEAVNWORTHBCDFG
JKMPQSUXYZLEAVNWORTHBCDFGI
KMPQSUXYZLEAVNWORTHBCDFGIJ
MPQSUXYZLEAVNWORTHBCDFGIJK
PQSUXYZLEAVNWORTHBCDFGIJKM
QSUXYZLEAVNWORTHBCDFGIJKMP
SUXYZLEAVNWORTHBCDFGIJKMPQ
UXYZLEAVNWORTHBCDFGIJKMPQS
XYZLEAVNWORTHBCDFGIJKMPQSU
YZLEAVNWORTHBCDFGIJKMPQSUX
ZLEAVNWORTHBCDFGIJKMPQSUXY

Figure 5.
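The construction of such a square can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the observation that the mixed sequence of Fig. 5 looks like a keyword-mixed sequence built on LEAVENWORTH is an inference from the figure, not a statement made in the text.

```python
from string import ascii_uppercase


def keyword_mixed_sequence(keyword: str) -> str:
    """Write the key word without repeated letters, then the unused letters in normal order."""
    seen = []
    for ch in keyword.upper() + ascii_uppercase:
        if ch.isalpha() and ch not in seen:
            seen.append(ch)
    return "".join(seen)


def cipher_square(component: str) -> list[str]:
    """Record the successive displacements of the component, one per line."""
    return [component[i:] + component[:i] for i in range(len(component))]


# Assumption: LEAVENWORTH is the key word behind the sequence of Fig. 5.
component = keyword_mixed_sequence("LEAVENWORTH")
square = cipher_square(component)

print(ascii_uppercase)
for line in square:
    print(line)
```

Every line of the square is the same sequence displaced by one more interval, which is the property exploited later under the name of direct symmetry of position.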

c. Such a cipher square may be used in exactly the same manner as the Vigenère square.
With the key word BLUE and conforming to the normal enciphering equations (Θk/2 = Θi/1;
Θp/1 = Θc/2), the following lines of the square would be used:

ABCDEFGHIJKLMNOPQRSTUVWXYZ

BCDFGIJKMPQSUXYZLEAVNWORTH
LEAVNWORTHBCDFGIJKMPQSUXYZ
UXYZLEAVNWORTHBCDFGIJKMPQS
EAVNWORTHBCDFGIJKMPQSUXYZL

Figure 6a.

These lines would, of course, yield the following cipher alphabets:

(1)  Plain    ABCDEFGHIJKLMNOPQRSTUVWXYZ
     Cipher   BCDFGIJKMPQSUXYZLEAVNWORTH

(2)  Plain    ABCDEFGHIJKLMNOPQRSTUVWXYZ
     Cipher   LEAVNWORTHBCDFGIJKMPQSUXYZ

(3)  Plain    ABCDEFGHIJKLMNOPQRSTUVWXYZ
     Cipher   UXYZLEAVNWORTHBCDFGIJKMPQS

(4)  Plain    ABCDEFGHIJKLMNOPQRSTUVWXYZ
     Cipher   EAVNWORTHBCDFGIJKMPQSUXYZL

Figure 6b.
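As an illustration of how the four alphabets of Fig. 6b are applied in rotation, the following Python sketch enciphers a text with the key word BLUE. The function names are hypothetical; the only assumptions are the mixed component of Fig. 5 and the convention, visible in Fig. 6b, that the key letter on the cipher component is set opposite A of the normal component.

```python
from string import ascii_uppercase as PLAIN

# Mixed cipher component as it appears in the top line of the square in Fig. 5.
CIPHER_COMPONENT = "LEAVNWORTHBCDFGIJKMPQSUXYZ"


def cipher_alphabet(key_letter: str) -> str:
    """Cipher alphabet obtained by sliding the key letter under plain A."""
    i = CIPHER_COMPONENT.index(key_letter)
    return CIPHER_COMPONENT[i:] + CIPHER_COMPONENT[:i]


def encipher(plaintext: str, key_word: str) -> str:
    """Encipher letters only, using the alphabets of the key word in rotation."""
    letters = [c for c in plaintext.upper() if c.isalpha()]
    out = []
    for n, p in enumerate(letters):
        alphabet = cipher_alphabet(key_word[n % len(key_word)])
        out.append(alphabet[PLAIN.index(p)])
    return "".join(out)


for key_letter in "BLUE":
    print(key_letter, cipher_alphabet(key_letter))
print(encipher("EEEE", "BLUE"))
```

Enciphering four successive E's shows the effect of the repeating key: the same plain letter receives a different equivalent in each of the four alphabets.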



18. Principles of direct symmetry of position. -- a. It was stated directly above that Fig. 5
is a symmetrical cipher square, by which is meant that the letters in its successive horizontal
lines show a symmetry of position with respect to one another. They constitute, in reality, one
and only one sequence or series of letters, the sequences being merely displaced successively 1,
2, 3, . . . intervals. The symmetry exhibited is obvious and is said to be visible, or direct.
This fact can be used to good advantage, as has already been alluded to in Par. 7j.

b. Consider, for example, the pair of letters Gc and Vc in cipher alphabet (1) of Fig. 6b. The
letter Vc is the 15th letter to the right of Gc. In cipher alphabet (2), Vc is also the 15th letter to
the right of Gc, as is the case in each of the four cipher alphabets in Fig. 6b, since the relative
positions they occupy are the same in each horizontal line in Fig. 6a, that is, in each of the
successive recordings of the cipher component as the latter is slid to the right against the plain or
normal component. If, therefore, the relative positions occupied by two letters, Θ1 and Θ2, in
such a cipher alphabet, C1, are known, and if the position of Θ1 in another cipher alphabet, C2,
belonging to the same series is known, then Θ2 may at once be placed into its correct position in C2.
Suppose, for example, that as the result of an analysis based upon considerations of frequency,
the following values in four cipher alphabets have been tentatively determined:

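(1)  Plain    A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
     Cipher   - - - - G - - - - - - - - - Y - - - - V - - - - - -

(2)  Plain    A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
     Cipher   - - - - N - - - - - - - - - G - - - - P - - - - - -

(3)  Plain    A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
     Cipher   - - - - L - - - - - - - - - B - - - - I - - - - - -

(4)  Plain    A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
     Cipher   - - - - W - - - - - - - - - I - - - - Q - - - - - -

Figure 7a.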

c. The cipher components of these four secondary alphabets may, for convenience, be assembled
into a cellular structure, hereinafter called a sequence reconstruction skeleton, as shown in
Fig. 7b. Regarding the top line of the reconstruction skeleton in Fig. 7b as being common to all
four secondary cipher alphabets listed in Fig. 7a, the successive lines of the reconstruction skeleton
may now be termed cipher alphabets, and may be referred to by the numbers at the left.

Plain       A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Cipher  1   - - - - G - - - - - - - - - Y - - - - V - - - - - -
        2   - - - - N - - - - - - - - - G - - - - P - - - - - -
        3   - - - - L - - - - - - - - - B - - - - I - - - - - -
        4   - - - - W - - - - - - - - - I - - - - Q - - - - - -

Figure 7b.



d. The letter G is common to Alphabets 1 and 2. In Alphabet 2 it is noted that N occupies
the 10th position to the left of G, and the letter P occupies the 5th position to the right of G.
One may therefore place these letters, N and P, in their proper positions in Alphabet 1, the letter N
being placed 10 letters before G, and the letter P, 5 letters after G. Thus:

Plain       A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
        1   - - - - G - - - - P - - - - Y - - - - V N - - - - -
Thus, the values of two new letters in Alphabet 1, viz, Pc = Jp, and Nc = Up, have been
automatically determined; these values were obtained without any analysis based upon the frequency
of Pc and Nc. Likewise, in Alphabet 2, the letters Y and V may be inserted in these positions:

Plain       A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
        2   - - - V N - - - - - - - - - G - - - - P - - - - Y -

This gives the new values Vc = Dp and Yc = Yp in Alphabet 2. Alphabets 3 and 4 have a common
letter I, which permits of the placement of Q and W in Alphabet 3, and of B and L in Alphabet 4.
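The transfer of values from one alphabet to another by direct symmetry of position can be sketched as follows. The function name is hypothetical, and the starting values (Gc = Ep, Yc = Op, Vc = Tp in Alphabet 1; Nc = Ep, Gc = Op, Pc = Tp in Alphabet 2) are an assumption reconstructed from the results quoted in subparagraph d, since the figure itself is damaged in this copy. The sketch also assumes, as the text requires, that the plain component is the known normal sequence.

```python
from string import ascii_uppercase as PLAIN


def transfer(known_from: dict, known_to: dict) -> dict:
    """Carry values from one cipher alphabet into another of the same series.

    Both arguments map cipher letter -> plain letter. A letter common to the
    two alphabets fixes the displacement between them; every other known
    letter of `known_from` is then placed at the same relative distance.
    """
    common = sorted(set(known_from) & set(known_to))
    if not common:
        return {}
    pivot = common[0]
    shift = (PLAIN.index(known_to[pivot]) - PLAIN.index(known_from[pivot])) % 26
    return {
        c: PLAIN[(PLAIN.index(p) + shift) % 26]
        for c, p in known_from.items()
        if c not in known_to
    }


alphabet_1 = {"G": "E", "Y": "O", "V": "T"}
alphabet_2 = {"N": "E", "G": "O", "P": "T"}

print(transfer(alphabet_2, alphabet_1))  # new values for Alphabet 1
print(transfer(alphabet_1, alphabet_2))  # new values for Alphabet 2
```

The two calls reproduce the four values obtained in the text without any recourse to frequency: Nc = Up and Pc = Jp in Alphabet 1, Yc = Yp and Vc = Dp in Alphabet 2.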

e. The new values thus found are of course immediately inserted throughout the cryptogram,
thus leading to the assumption of further values in the cipher text. This process, viz, the
reconstruction of the primary components, by the application of the principles of direct symmetry
of position to the cells of the reconstruction skeleton, thus facilitates and hastens solution.

f. It must be clearly understood that before the principles of direct symmetry of position
can be applied in cases such as the foregoing, it is necessary that the plain component be a known
sequence. Whether it is the normal sequence or not is immaterial, so long as the sequence is
known. Obviously, if the sequence is unknown, symmetry, even if present, cannot be detected
by the cryptanalyst because he has no base upon which to try out his assumptions for
symmetry. In other words, direct symmetry of position is manifested in the illustrative
example because the plain component is a known sequence, and not because it is the
normal alphabet. The significance of this point will become apparent later on in connection
with the problem discussed in Par. 26b.

19. Initial steps in the solution of a typical example. -- a. In the light of the foregoing
principles let a typical message now be studied.

Message

    12345
A.  QWBRI  VWYCA  ISPJL  RBZEY  QWYEU
B.  LWMGW  ICJCI  MTZEI  MIBKN  QWBRI
C.  VWYIG  BWNBQ  QCGQH  IWJKA  GEGXN
D.  IDMRU  VEZYG  QIGVN  CTGYO  BPDBL
E.  VCGXG  BKZZG  IVXCU  NTZAO  BWFEQ
F.  QLFCO  MTYZT  CCBYQ  OPDKA  GDGIG
G.  VPWMR  QIIEW  ICGXG  BLGQQ  VBGRS
H.  MYJJY  QVFWY  RWNFL  GXNFW  MCJKX
J.  IDDRU  OPJQQ  ZRHCN  VWDYQ  RDGDG
K.  BXDBN  PXFPU  YXNFG  MPJEL  SANCD
L.  SEZZG  IBEYU  KDHCA  MBJJF  KILCJ
M.  MFDZT  CTJRD  MIYZQ  ACJRR  SBGZN
N.  QYAHQ  VEDCQ  LXNCL  LVVCS  QWBII
P.  IVJRN  WNBRI  VPJEL  TAGDN  IRGQP
Q.  ATYEW  CBYZT  EVGQU  VPYHL  LRZNQ
R.  XINBA  IKWJQ  RDZYF  KWFZL  GWFJQ
S.  QWJYQ  IBWRX
b. The principal repetitions of three or more letters were underlined in the original message, and
the factors (up to 20 only) of the intervals between them are as follows:

QWBRIVWY    45 = 3, 5, 9, 15
CGXGB       60 = 2, 3, 4, 5, 6, 10, 12, 15, 20
PJEL        95 = 5, 19
ZZGI       145 = 5
BRIV       285 = 3, 5, 15, 19
BRI         45 = 3, 5, 9, 15
KAG         75 = 3, 5, 15
QRD        165 = 3, 5, 11, 15
QWB         45 = 3, 5, 9, 15
QWB        275 = 5, 11
WIC        130 = 2, 5, 10, 13
XNF         45 = 3, 5, 9, 15
YZT        225 = 3, 5, 9, 15
ZTC        145 = 5

The factor 5 is common to all of these repetitions, and there seems to be every indication that
five alphabets are involved. Since the message already appears in groups of five letters, it is
unnecessary in this case to rewrite it in groups corresponding to the length of the key. The
uniliteral frequency distribution for Alphabet 1 is as follows:

[Uniliteral frequency distribution for Alphabet 1: tally marks over the letters A to Z; the tallies are not legible in this copy.]

Figure 8.
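The interval factoring and the distribution for Alphabet 1 can be recomputed mechanically. The sketch below is only illustrative; the function names are hypothetical, and the cipher text is transcribed from the three printings of the message in this section, so any transcription slip would carry through to the counts.

```python
from collections import Counter

# Assumption: cipher text as reconstructed from the message in Par. 19.
CIPHERTEXT = """
QWBRI VWYCA ISPJL RBZEY QWYEU LWMGW ICJCI MTZEI MIBKN QWBRI
VWYIG BWNBQ QCGQH IWJKA GEGXN IDMRU VEZYG QIGVN CTGYO BPDBL
VCGXG BKZZG IVXCU NTZAO BWFEQ QLFCO MTYZT CCBYQ OPDKA GDGIG
VPWMR QIIEW ICGXG BLGQQ VBGRS MYJJY QVFWY RWNFL GXNFW MCJKX
IDDRU OPJQQ ZRHCN VWDYQ RDGDG BXDBN PXFPU YXNFG MPJEL SANCD
SEZZG IBEYU KDHCA MBJJF KILCJ MFDZT CTJRD MIYZQ ACJRR SBGZN
QYAHQ VEDCQ LXNCL LVVCS QWBII IVJRN WNBRI VPJEL TAGDN IRGQP
ATYEW CBYZT EVGQU VPYHL LRZNQ XINBA IKWJQ RDZYF KWFZL GWFJQ
QWJYQ IBWRX
"""
text = "".join(CIPHERTEXT.split())


def repetition_intervals(text: str, length: int = 3) -> dict:
    """Map each repeated polygraph to the intervals between its successive occurrences."""
    positions = {}
    for i in range(len(text) - length + 1):
        positions.setdefault(text[i:i + length], []).append(i)
    return {g: [b - a for a, b in zip(p, p[1:])]
            for g, p in positions.items() if len(p) > 1}


def factors(n: int, limit: int = 20) -> list:
    """Factors of n from 2 up to the limit used in the text."""
    return [f for f in range(2, limit + 1) if n % f == 0]


intervals = repetition_intervals(text)
for gram in ("QWB", "BRI", "KAG", "QRD", "WIC", "XNF", "YZT", "ZTC"):
    for d in intervals[gram]:
        print(gram, d, factors(d))

# Uniliteral distribution for Alphabet 1: every fifth letter, starting with the first.
alphabet_1 = Counter(text[0::5])
print(sorted(alphabet_1.items()))
```

Taking every fifth letter from offsets 1 to 4 in the same way would give the distributions for Alphabets 2 to 5.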

c. Attempts to fit this distribution to the normal on the basis of a direct or reversed standard
alphabet do not give positive results, and it is assumed that mixed alphabets are involved.
Individual triliteral frequency distributions are then compiled and are shown in Fig. 9. These
tables are similar to those made for single mixed alphabet ciphers, and are made in the same
way except that instead of taking the letters one after the other, the letters which belong to the
separate alphabets now must be assembled in separate tables. For example, in Alphabet 1,
the trigraph QAC means that A occurs in Alphabet 1; Q, its prefix, occurs in Alphabet 5, and C, its
suffix, occurs in Alphabet 2. All confusion may be avoided by placing numbers indicating the
alphabets in which they belong above the letters, thus:

5 1 2
Q A C



[Triliteral frequency distributions for Alphabets 1 to 5: under each cipher letter of an alphabet are listed the prefix-suffix pairs with which it occurs. The entries are not legible in this copy.]

Condensed table of repetitions

Alphabets           Repetition and number of occurrences
1-2-3-4-5-1-2-3     QWBRIVWY-2
2-3-4-5-1           CGXGB-2
2-3-4-5             PJEL-2
3-4-5-1             BRIV-3, ZZGI-2
1-2-3               QWB-3, VWY-2
2-3-4               CGX-2, PJE-2, WBR-2, XNF-2
3-4-5               BRI-3, GXG-2, JEL-2, YZT-2, ZZG-2
4-5-1               KAG-2, RIV-3, XGB-2, ZGI-2, ZTC-2
5-1-2               IVW-2, QRD-2, WIC-2
1-2                 QW-5, VP-3, VW-3
2-3                 CG-3, CJ-3, PJ-3, WB-3, WF-3, WY-3, XN-3
3-4                 BR-3, GQ-4, GX-3, JR-3, NF-3, YZ-3
4-5                 RI-3, YQ-3, ZT-3
5-1                 GB-4, IV-3, QQ-3

Figure 9.
d. One now proceeds to analyze each alphabet distribution, in an endeavor to establish
identifications of cipher equivalents. First, of course, attempts should be made to separate
the vowels from the consonants in each alphabet, using the same test as in the case of a single
mixed-alphabet cipher. There seems to be no doubt about the equivalent of Ep in each alphabet:

Alphabet     1     2     3     4     5
Ep =         Ic    Wc    Gc    Cc    Qc

e. The letters of greatest frequency in Alphabet 1 are I, M, Q, V, B, G, L, R, S, and C. Ic
has already been assumed to be Ep. If Wc of Alphabet 2 and Qc of Alphabet 5 both equal Ep, then
one should be able to distinguish the vowels from the consonants among the letters M, Q, V, B, G,
L, R, S, and C by examining the prefixes of Wc (Alphabet 2) and the suffixes of Qc (Alphabet 5),
since both the prefixes of the one and the suffixes of the other fall in Alphabet 1. The prefixes
and suffixes of these letters, as shown by the triliteral frequency distributions, are these:

Prefixes of Wc, Alphabet 2 (= Ep):    Q G K V R B I L
Suffixes of Qc, Alphabet 5 (= Ep):    I Q R X L V A Z O

f. Consider now the letter Mc of Alphabet 1; it does not occur either as a prefix of Wc or as a
suffix of Qc. Hence it is most probably a vowel, and on account of its high frequency it may be
assumed to be Op. On the other hand, note that Qc of Alphabet 1 occurs five times as a prefix of
Wc (Alphabet 2) and three times as a suffix of Qc (Alphabet 5). It is therefore a consonant, most
probably Rp, for it would give the digraph ER (= QQc in Alphabets 5-1) as occurring three times
and RE (= QWc in Alphabets 1-2) as occurring five times.

g. The letter Vc of Alphabet 1 occurs three times as a prefix of Wc and twice as a suffix of Qc.
It is therefore a consonant, and on account of its frequency, let it be assumed to be Tp. The
letter Bc of Alphabet 1 occurs twice as a prefix of Wc but not as a suffix of Qc. Its frequency is
only medium, and it is probably a consonant. In fact, the twice repeated digraph BWc (Alphabets
1-2) is once a part of the trigraph GBWc (Alphabets 5-1-2), and Gc, the letter of second highest
frequency in Alphabet 5, looks excellent for Tp. Might not the trigraph GBWc be THE? It will be
well to keep this possibility in mind.

h. The letter Gc of Alphabet 1 occurs only once as a prefix of Wc and does not occur as a suffix
of Qc. It may be a vowel, but one can not be sure. The letter Lc of Alphabet 1 occurs once as a
prefix of Wc and once as a suffix of Qc. It may be considered to be a consonant. Rc of Alphabet 1
occurs once as a prefix of Wc, and twice as a suffix of Qc, and is certainly a consonant. Neither
the letter Sc nor the letter Cc of Alphabet 1 occurs as a prefix of Wc or as a suffix of Qc; both
would seem to be vowels, but a study of the prefixes and suffixes of these letters lends more
weight to the assumption that Cc is a vowel than that Sc is a vowel. For all the prefixes of Cc,
viz, N, T, and W in Alphabet 5, are in subsequent analysis of Alphabet 5 classified as consonants,
as are likewise its suffixes, viz, T, C, and B in Alphabet 2. On the other hand, only one prefix,
Lc of Alphabet 5, and one suffix, Bc of Alphabet 2, of Sc of Alphabet 1 are later classified as
consonants. Since vowels are more often associated with consonants than with other vowels, it
would seem that Cc is more likely to be a vowel than Sc. At any rate Cc is assumed to be a vowel,
for the present, leaving Sc unclassified.
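The test applied in subparagraphs e to h can be sketched as follows: collect the letters of Alphabet 1 that stand immediately before Wc of Alphabet 2 and immediately after Qc of Alphabet 5, both assumed to be Ep. The function names are hypothetical and the cipher text is a transcription of the message of Par. 19; the reading of the counts as vowel or consonant remains the judgment described in the text, not something the code decides.

```python
from collections import Counter

# Assumption: cipher text as reconstructed from the message in Par. 19.
CIPHERTEXT = """
QWBRI VWYCA ISPJL RBZEY QWYEU LWMGW ICJCI MTZEI MIBKN QWBRI
VWYIG BWNBQ QCGQH IWJKA GEGXN IDMRU VEZYG QIGVN CTGYO BPDBL
VCGXG BKZZG IVXCU NTZAO BWFEQ QLFCO MTYZT CCBYQ OPDKA GDGIG
VPWMR QIIEW ICGXG BLGQQ VBGRS MYJJY QVFWY RWNFL GXNFW MCJKX
IDDRU OPJQQ ZRHCN VWDYQ RDGDG BXDBN PXFPU YXNFG MPJEL SANCD
SEZZG IBEYU KDHCA MBJJF KILCJ MFDZT CTJRD MIYZQ ACJRR SBGZN
QYAHQ VEDCQ LXNCL LVVCS QWBII IVJRN WNBRI VPJEL TAGDN IRGQP
ATYEW CBYZT EVGQU VPYHL LRZNQ XINBA IKWJQ RDZYF KWFZL GWFJQ
QWJYQ IBWRX
"""
text = "".join(CIPHERTEXT.split())
PERIOD = 5


def prefixes(text: str, letter: str, alphabet: int) -> Counter:
    """Letters standing immediately before `letter` when it falls in `alphabet` (1-based)."""
    return Counter(text[i - 1] for i in range(1, len(text))
                   if text[i] == letter and i % PERIOD == alphabet - 1)


def suffixes(text: str, letter: str, alphabet: int) -> Counter:
    """Letters standing immediately after `letter` when it falls in `alphabet` (1-based)."""
    return Counter(text[i + 1] for i in range(len(text) - 1)
                   if text[i] == letter and i % PERIOD == alphabet - 1)


before_W2 = prefixes(text, "W", 2)   # these letters belong to Alphabet 1
after_Q5 = suffixes(text, "Q", 5)    # these letters also belong to Alphabet 1

for letter in "MQVBGLRSC":
    print(letter, "prefix of W:", before_W2[letter], " suffix of Q:", after_Q5[letter])
```

A letter of Alphabet 1 that never touches either assumed Ep, such as M, is a candidate vowel; one that touches both freely, such as Q or V, is a candidate consonant.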

i. Going through the same steps with the remaining alphabets, the following results are
obtained:

Alphabet    Consonants                Vowels
   1        Q, V, B, L, R, G?         I, M, C
   2        B, C, D, T                W, P, I
   3        J, N, D, Y, F             G, Z
   4        Y, Z, J, Q                C, E?, R?, B?
   5        G, N, A, I, W, L, T       Q, U



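20. Application of principles of direct symmetry of position. -- a. The next step is to try
to determine a few values in each alphabet. In Alphabet 1, from the foregoing analysis, the
following data are on hand:

Plain     A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X  Y  Z
Cipher    C? -  -  -  I  -  -  -  C? -  -  -  -  -  M  -  -  Q  -  V  -  -  -  -  -  -

Let the values of Ep already assumed in the remaining alphabets be set down in a reconstruction
skeleton, as follows:

Plain         A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X  Y  Z
Cipher   1    C? -  -  -  I  -  -  -  C? -  -  -  -  -  M  -  -  Q  -  V  -  -  -  -  -  -
         2    -  -  -  -  W  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -
         3    -  -  -  -  G  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -
         4    -  -  -  -  C  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -
         5    -  -  -  -  Q  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -

Figure 10.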



b. It is seen that by good fortune the letter Q is common to Alphabets 1 and 5, and the
letter C is common to Alphabets 1 and 4. If it is assumed that one is dealing with a case in which
a mixed component is sliding against the normal component, one can apply the principles of
direct symmetry of position to these alphabets, as outlined in Par. 18. For example, one may
insert the following values in Alphabet 5:

Plain         A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X  Y  Z
Cipher   1    C? -  -  -  I  -  -  -  C? -  -  -  -  -  M  -  -  Q  -  V  -  -  -  -  -  -
         5    -  M  -  -  Q  -  V  -  -  -  -  -  -  C? -  -  -  I  -  -  -  C? -  -  -  -

Figure 11.
c. The process at once gives three definite values in Alphabet 5: Mc = Bp, Vc = Gp, Ic = Rp.
Let these deduced values be substantiated by referring to the frequency distribution. Since B and
G are normally low or medium frequency letters in plain text, one should find that Mc and Vc,
their hypothetical equivalents in Alphabet 5, should have low frequencies. As a matter of fact,
they do not appear in this alphabet, which thus far corroborates the assumption. On the other
hand, since Ic = Rp in Alphabet 5, if the values derived from symmetry of position are correct,
Ic should be of high frequency, and reference to the distribution shows that Ic is of high
frequency. The position of C is doubtful; it belongs either under Np or Vp. If the former is
correct, then the frequency of Cc in Alphabet 5 should be high, for it would equal Np; if the
latter is correct, then its frequency should be low, for it would equal Vp. As a matter of fact,
Cc does not occur in Alphabet 5, and it must be concluded that it belongs under Vp. This in turn
settles the value of Cc in Alphabet 1, for it must now be placed definitely under Ip and removed
from beneath Ap.
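The step from Alphabet 1 to Alphabet 5 may be sketched in Python as below. The names are hypothetical; the sketch assumes the normal plain component and takes as given the values stated in the text (Ic = Ep, Mc = Op, Qc = Rp, Vc = Tp in Alphabet 1, Qc = Ep in Alphabet 5, and Cc doubtful between Ap and Ip).

```python
from string import ascii_uppercase as PLAIN

# Firm values in Alphabet 1 (cipher -> plain), as assumed in the text.
ALPHABET_1 = {"I": "E", "M": "O", "Q": "R", "V": "T"}
# Cc of Alphabet 1 is still doubtful: it stands under Ap or under Ip.
C_CANDIDATES_1 = ["A", "I"]


def displacement(plain_in_first: str, plain_in_second: str) -> int:
    """Displacement between two alphabets, from a letter common to both."""
    return (PLAIN.index(plain_in_second) - PLAIN.index(plain_in_first)) % 26


def carry(plain_letter: str, shift: int) -> str:
    """Plain letter over the same cipher letter after the component has slid."""
    return PLAIN[(PLAIN.index(plain_letter) + shift) % 26]


# Qc equals Rp in Alphabet 1 and Ep in Alphabet 5.
shift = displacement("R", "E")
alphabet_5 = {c: carry(p, shift) for c, p in ALPHABET_1.items()}
c_options_5 = {p1: carry(p1, shift) for p1 in C_CANDIDATES_1}

print(alphabet_5)    # values carried into Alphabet 5
print(c_options_5)   # where Cc would fall in Alphabet 5 under each hypothesis
```

The two candidate positions for Cc in Alphabet 5 are then weighed against the observed frequency of Cc in that alphabet, exactly as the paragraph above describes; the code only lists the alternatives.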



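d. The definite placement of C now permits the insertion of new values in Alphabet 4, and
one now has the following:

Plain         A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Cipher   1    - - - - I - - - C - - - - - M - - Q - V - - - - - -
         2    - - - - W - - - - - - - - - - - - - - - - - - - - -
         3    - - - - G - - - - - - - - - - - - - - - - - - - - -
         4    I - - - C - - - - - M - - Q - V - - - - - - - - - -
         5    - M - - Q - V - - - - - - - - - - I - - - C - - - -

Figure 12.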



21. Subsequent steps in solution. -- a. It is high time that the thus far deduced values, as
recorded in the reconstruction skeleton, be inserted in the cipher text, for by this time it must seem
that the analysis has certainly gone too far upon unproved hypotheses. The following results
are obtained:

Message

    12345
A.  QWBRI  VWYCA  ISPJL  RBZEY  QWYEU
    RE--R  TE-E-  E----  -----  RE---
B.  LWMGW  ICJCI  MTZEI  MIBKN  QWBRI
    -E---  E--ER  O---R  O----  RE--R
C.  VWYIG  BWNBQ  QCGQH  IWJKA  GEGXN
    TE-A-  -E--E  R-EN-  EE---  --E--
D.  IDMRU  VEZYG  QIGVN  CTGYO  BPDBL
    E----  T----  R-EP-  I-E--  -----
E.  VCGXG  BKZZG  IVXCU  NTZAO  BWFEQ
    T-E--  -----  E--E-  -----  -E--E
F.  QLFCO  MTYZT  CCBYQ  OPDKA  GDGIG
    R--E-  O----  I---E  -----  --EA-
G.  VPWMR  QIIEW  ICGXG  BLGQQ  VBGRS
    T--K-  R----  E-E--  --ENE  T-E--
H.  MYJJY  QVFWY  RWNFL  GXNFW  MCJKX
    O----  R----  -E---  -----  O----
J.  IDDRU  OPJQQ  ZRHCN  VWDYQ  RDGDG
    E----  ---NE  ---E-  TE--E  --E--
K.  BXDBN  PXFPU  YXNFG  MPJEL  SANCD
    -----  -----  -----  O----  ---E-
L.  SEZZG  IBEYU  KDHCA  MBJJF  KILCJ
    -----  E----  ---E-  O----  ---E-
M.  MFDZT  CTJRD  MIYZQ  ACJRR  SBGZN
    O----  I----  O---E  -----  --E--
N.  QYAHQ  VEDCQ  LXNCL  LVVCS  QWBII
    R---E  T--EE  ---E-  ---E-  RE-AR
P.  IVJRN  WNBRI  VPJEL  TAGDN  IRGQP
    E----  ----R  T----  --E--  E-EN-
Q.  ATYEW  CBYZT  EVGQU  VPYHL  LRZNQ
    -----  I----  --EN-  T----  ----E
R.  XINBA  IKWJQ  RDZYF  KWFZL  GWFJQ
    -----  E---E  -----  -E---  -E--E
S.  QWJYQ  IBWRX
    RE--E  E----
b. The combinations given are excellent throughout and no inconsistencies appear. Note
the trigraph QWB (Alphabets 1-2-3), which is repeated in the following polygraphs (underlined in
the original text):

1 2 3 4 5 1            5 1 2 3 4 5 1
Q W B R I V   . . .    S Q W B I I I
R E - - R T   . . .    - R E - A R E

c. The letter Bc of Alphabet 3 is common to both polygraphs, and a little imagination will lead
to the assumption of the value Bc = Pp in Alphabet 3, yielding the following:

1 2 3 4 5 1            5 1 2 3 4 5 1
Q W B R I V   . . .    S Q W B I I I
R E P O R T   . . .    P R E P A R E

d. Note also (beginning in the fifth group of line F) the polygraph IGVPWM, which looks like
the word ATTACK:

4 5 1 2 3 4
I G V P W M
A - T - - K

The frequency distributions are consulted to see whether the frequencies given for Gc of
Alphabet 5 and Pc of Alphabet 2 are high enough for Tp and Ap, respectively, and also whether
the frequency of Wc of Alphabet 3 is good enough for Cp; it is noted that they are excellent.
Moreover, the digraph GBc (Alphabets 5-1), which occurs four times, looks like TH, thus making
Bc = Hp in Alphabet 1. Does the insertion of these four new values in our diagram of alphabets
bring forth any inconsistencies? The insertion of the values Pc = Ap in Alphabet 2 and Bc = Hp
in Alphabet 1 gives no indications either way, since neither letter has yet been located in any of
the other alphabets. The insertion of the value Gc = Tp in Alphabet 5 gives a value common to
Alphabets 3 and 5, for the value Gc = Ep in Alphabet 3 was assumed long ago. Unfortunately an
inconsistency is found here. The letter I has been placed two letters to the left of G in the mixed
component, and has given good results in Alphabets 1 and 5; if the value Wc = Cp in Alphabet 3
(obtained above from the assumption of the word ATTACK) is correct, then W, and not I, should
be the second letter to the left of G. Which shall be retained? There has been so far nothing to
establish the value of Gc = Ep in Alphabet 3; this value was assumed from frequency
considerations solely. Perhaps it is wrong. It certainly behaves like a vowel, and one may see
what happens when one changes its value to Op. The following placements in the reconstruction
skeleton result from the analysis, when only two or three new values have been added as a result
of the clues afforded by the deductions:

Plain         A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Cipher   1    - - S - I - G B C - - - - - M - P Q R V W - - - - -
         2    P Q R V W - - - - - - - S - I - G B C - - - - - M -
         3    R V W - - - - - - - S - I - G B C - - - - - M - P Q
         4    I - G B C - - - - - M - P Q R V W - - - - - - - S -
         5    - M - P Q R V W - - - - - - - S - I - G B C - - - -

Figure 13a.
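The inconsistency met in subparagraph d can be detected mechanically by expressing every assumed value as a displacement from a common pivot letter and looking for two different letters that claim the same cell of the mixed component. The sketch below is illustrative only; the function names are hypothetical, and only the values actually discussed in the text (Ic = Rp and Gc = Tp in Alphabet 5; Wc = Cp with Gc = Ep or Op in Alphabet 3) are used.

```python
from string import ascii_uppercase as PLAIN


def displacements(values: dict, pivot: str) -> dict:
    """Displacement of each cipher letter from the pivot letter within the sliding component."""
    base = PLAIN.index(values[pivot])
    return {c: (PLAIN.index(p) - base) % 26 for c, p in values.items()}


def conflicts(first: dict, second: dict, pivot: str) -> list:
    """Pairs of different letters that the two alphabets place in the same cell."""
    occupied = {d: c for c, d in displacements(first, pivot).items()}
    clashes = []
    for c, d in displacements(second, pivot).items():
        if d in occupied and occupied[d] != c:
            clashes.append((c, occupied[d], d))
    return clashes


alphabet_5 = {"I": "R", "G": "T"}        # I stands two places to the left of G
alphabet_3_old = {"G": "E", "W": "C"}    # Gc = Ep, from frequency alone
alphabet_3_new = {"G": "O", "W": "C"}    # Gc changed to Op

print(conflicts(alphabet_5, alphabet_3_old, "G"))   # W and I claim the same cell
print(conflicts(alphabet_5, alphabet_3_new, "G"))   # no clash remains
```

Changing Gc of Alphabet 3 from Ep to Op removes the clash, which is the choice made in the text.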
e. Many new values are produced, and these are inserted throughout the message, yielding
the following:

    12345
A.  QWBRI  VWYCA  ISPJL  RBZEY  QWYEU
    REPOR  TE-E-  EMY--  SR---  RE---
B.  LWMGW  ICJCI  MTZEI  MIBKN  QWBRI
    -EWCH  ES-ER  O---R  OOP--  REPOR
C.  VWYIG  BWNBQ  QCGQH  IWJKA  GEGXN
    TE-AT  HE-DE  RSON-  EE---  G-O--
D.  IDMRU  VEZYG  QIGVN  CTGYO  BPDBL
    E-WO-  T---T  ROOP-  I-O--  HA-D-
E.  VCGXG  BKZZG  IVXCU  NTZAO  BWFEQ
    TSO-T  H---T  ED-E-  -----  HE--E
F.  QLFCO  MTYZT  CCBYQ  OPDKA  GDGIG
    R--E-  O----  ISP-E  -A---  G-OAT
G.  VPWMR  QIIEW  ICGXG  BLGQQ  VBGRS
    TACKF  ROM-H  ESO-T  H-ONE  TROOP
H.  MYJJY  QVFWY  RWNFL  GXNFW  MCJKX
    O----  RD-Q-  SE---  G---H  OS---
J.  IDDRU  OPJQQ  ZRHCN  VWDYQ  RDGDG
    E--O-  -A-NE  -C-E-  TE--E  S-O-T
K.  BXDBN  PXFPU  YXNFG  MPJEL  SANCD
    H--D-  Q--M-  ----T  OA---  C--E-
L.  SEZZG  IBEYU  KDHCA  MBJJF  KILCJ
    C---T  ER---  ---E-  OR---  -O-E-
M.  MFDZT  CTJRD  MIYZQ  ACJRR  SBGZN
    O----  I--O-  OO--E  -S-OF  CRO--
N.  QYAHQ  VEDCQ  LXNCL  LVVCS  QWBII
    R---E  T--EE  ---E-  -DBEP  REPAR
P.  IVJRN  WNBRI  VPJEL  TAGDN  IRGQP
    ED-O-  U-POR  TA---  --O--  ECOND
Q.  ATYEW  CBYZT  EVGQU  VPYHL  LRZNQ
    ----H  IR---  -DON-  TA---  -C--E
R.  XINBA  IKWJQ  RDZYF  KWFZL  GWFJQ
    -O-D-  E---E  S----  -E---  GE--E
S.  QWJYQ  IBWRX
    RE--E  ER-O-
22. Completing the solution. -- a. Completion of solution is now a very easy matter.
The mixed component is finally found to be the following sequence, based upon the word
EXHAUSTING:

EXHAUSTINGBCDFJKLMOPQRVWYZ

and the completely reconstructed skeleton of the cipher square is shown in Fig. 13b.

Plain         A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Cipher   1    A U S T I N G B C D F J K L M O P Q R V W Y Z E X H
         2    P Q R V W Y Z E X H A U S T I N G B C D F J K L M O
         3    R V W Y Z E X H A U S T I N G B C D F J K L M O P Q
         4    I N G B C D F J K L M O P Q R V W Y Z E X H A U S T
         5    L M O P Q R V W Y Z E X H A U S T I N G B C D F J K

Figure 13b.



b. Note that the successive equivalents of Ap spell the word APRIL, which is the key for the
message. The plain-text message is as follows:

REPORTED ENEMY HAS RETIRED TO NEWCHESTER. ONE TROOP IS REPORTED AT HEN- 
DERSON MEETING HOUSE: TWO OTHER TROOPS IN ORCHARD AT SOUTHWEST EDGE OF NEW- 
CHESTER. 2D SQ IS PREPARING TO ATTACK FROM THE SOUTH. ONE TROOP OF 3D SQ IS 
ENGAGING HOSTILE TROOP AT NEWCHESTER. REST OF 3D SQ IS MOVING TO ATTACK 
NEWCHESTER FROM THE NORTH. MOVE YOUR SQ INTO WOODS EAST OF CROSSROAD 539 AND 
BE PREPARED TO SUPPORT ATTACK OF 2D AND 3D SQ. DO NOT ADVANCE BEYOND NEWCHESTER.
MESSAGES HERE. 

TREER, 

COL. 

c. The preceding case is a good example of the value of the principles of direct symmetry 
of position when applied properly to a cryptogram enciphered by the sliding of a mixed com- 
ponent against the normal. The cryptanalyst starts off with only a very limited number of 
assumptions and builds up many new values as a result of the placement of the few original 
values in the reconstruction skeleton. 
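
The construction of Fig. 13b can be checked mechanically. The following Python sketch is offered only as an illustration; the function names are hypothetical and it assumes that the key letter on the cipher component is set under A of the normal plain component, as in the example above.

```python
import string

# Mixed cipher component recovered in Par. 22a.
COMPONENT = "EXHAUSTINGBCDFJKLMOPQRVWYZ"


def cipher_row(component: str, key_letter: str) -> str:
    """Cipher alphabet obtained when key_letter stands under plain A."""
    start = component.index(key_letter)
    return component[start:] + component[:start]


def cipher_square(component: str, key: str) -> list:
    """One cipher alphabet per key letter, in key order."""
    return [cipher_row(component, k) for k in key]


if __name__ == "__main__":
    print("Plain  " + string.ascii_uppercase)
    for number, row in enumerate(cipher_square(COMPONENT, "APRIL"), start=1):
        print(f"{number:>5}  {row}")
```

The first letters of the five rows, read downward, are the equivalents of Ap and spell the key word.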

23. Solution of subsequent messages enciphered by the same cipher component. — a.
Preliminary remarks. — Let it be supposed that the correspondents are using the same basic or
primary component but with different key words for other messages. Can the knowledge of 
the sequence of letters in the reconstructed primary component be used to solve the subsequent 
messages? It has been shown that in the case of a monoalphabetic cipher in which a mixed 
alphabet was used, the process of completing the plain component could be applied to solve 
subsequent messages in which the same cipher component was used, even though the cipher 
component was set at a different key letter. A modification of the procedure used in that case 
can be used in this case, where a plurality of cipher alphabets based upon a sliding primary 
component is used. 
b. The message. — Let it be supposed that the following message passing between the same
two correspondents as in the preceding message has been intercepted:

MESSAGE

SFDZR YRRKX MIWLL AQRLU RQFRT
IJQKF XUWBS MDJZK MICQC UDPTV
TYRNH TRORV BQLTI QBNPR RTUHD
PTIVE RMGQN LRATQ PLUKR KGRZF
JCMGP IHSMR GQRFX BCABA OEMTL
PCXJM RGQSZ VB











c. Factoring and conversion into plain component equivalents. — The presence of a repetition
of a four-letter polygraph whose interval is 21 letters suggests a key word of seven letters. There
are very few other repetitions, and this is to be expected in a short message with a key of such
length.

d. Transcription into periods. — Let the message be written in groups of seven letters, in
columnar fashion, as shown in Fig. 14. The letters in each column belong to a single alphabet.
Let the letters in each column be converted into their plain-component equivalents by setting
the reconstructed cipher component against the normal alphabet at any arbitrarily selected
point, for example, that shown below:

Plain ....  ABCDEFGHIJKLMNOPQRSTUVWXYZ
Cipher ...  EXHAUSTINGBCDFJKLMOPQRVWYZ

      1  2  3  4  5  6  7

      S  F  D  Z  R  Y  R
      R  K  X  M  I  W  L
      L  A  Q  R  L  U  R
      Q  F  R  T  I  J  Q
      K  F  X  U  W  B  S
      M  D  J  Z  K  M  I
      C  Q  C  U  D  P  T
      V  T  Y  R  N  H  T
      R  O  R  V  B  Q  L
      T  I  Q  B  N  P  R
      R  T  U  H  D  P  T
      I  V  E  R  M  G  Q
      N  L  R  A  T  Q  P
      L  U  K  R  K  G  R
      Z  F  J  C  M  G  P
      I  H  S  M  R  G  Q
      R  F  X  B  C  A  B
      A  O  E  M  T  L  P
      C  X  J  M  R  G  Q
      S  Z  V  B

          FIGURE 14.

      1  2  3  4  5  6  7

      F  N  M  Z  V  Y  V

The columns of equivalents are now as shown in Fig. 15.
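
The factoring and conversion steps of subparagraphs c and d may be sketched in Python as follows. This is a minimal illustration, not a prescribed procedure; the function names are hypothetical, and the conversion assumes the arbitrary setting used above, with E of the cipher component under A of the normal alphabet.

```python
from collections import defaultdict

# Mixed cipher component recovered from the first message (Par. 22a).
COMPONENT = "EXHAUSTINGBCDFJKLMOPQRVWYZ"

# Cipher text of Par. 23b, spaces removed.
MESSAGE = (
    "SFDZR YRRKX MIWLL AQRLU RQFRT IJQKF XUWBS MDJZK MICQC UDPTV "
    "TYRNH TRORV BQLTI QBNPR RTUHD PTIVE RMGQN LRATQ PLUKR KGRZF "
    "JCMGP IHSMR GQRFX BCABA OEMTL PCXJM RGQSZ VB"
).replace(" ", "")


def repeat_intervals(text, n):
    """Map each recurring n-letter polygraph to the intervals between its occurrences."""
    positions = defaultdict(list)
    for i in range(len(text) - n + 1):
        positions[text[i:i + n]].append(i)
    return {
        gram: [b - a for a, b in zip(where, where[1:])]
        for gram, where in positions.items()
        if len(where) > 1
    }


def to_equivalents(text, component=COMPONENT):
    """Replace each cipher letter by the normal-alphabet letter standing above it."""
    return "".join(chr(ord("A") + component.index(c)) for c in text)


def rows(text, period):
    """Transcribe the text into successive rows of `period` letters."""
    return [text[i:i + period] for i in range(0, len(text), period)]


if __name__ == "__main__":
    print(repeat_intervals(MESSAGE, 4))
    for row in rows(to_equivalents(MESSAGE), 7):
        print(" ".join(row))
```

An interval of 21 has the factors 3 and 7; the period of seven is the assumption carried forward in the transcription.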

e. Examination and selection of generatrices. — It has been shown that in the case of a mono-
alphabetic cipher it was merely necessary to complete the normal alphabet sequence beneath
the plain-component equivalents and the plain text all reappeared on one generatrix. It was
also found that in the case of a multiple-alphabet cipher involving standard alphabets, the plain-
text equivalents of each alphabet reappeared on the same generatrix, and it was necessary only
to combine the proper generatrices in order to produce the plain text of the message. In the
case at hand both processes are combined: the normal alphabet sequence is continued beneath
the letters of each column and then the generatrices are combined to produce the plain text.
The completely developed generatrix diagrams for the first two columns are as follows (Fig. 16):
21 IKYIIHPBNCBRLZIXINWU
f. Combining the selected generatrices. — After some experi-
menting with these generatrices the 23d generatrix of Column 1 and
the 1st of Column 2, which yield the digraphs shown in Fig. 17a,
are combined. The generatrices of the subsequent columns are
examined to select those which may be added to these already
selected in order to build up the plain text. The results are shown
in Fig. 17b. This process is a very valuable aid in the solution of
messages after the primary component has been recovered as a
result of the longer and more detailed analysis of the frequency
distributions of the first message intercepted. Very often a short
message can be solved in no other way than the one shown,
if the primary component is completely known.
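
The completion and combination of generatrices described in subparagraphs e and f can be imitated with a short Python sketch. It is illustrative only: the names are hypothetical, the generatrices are numbered so that the n-th advances each equivalent n places along the normal alphabet, and the equivalents are taken at the setting of subparagraph d.

```python
# Mixed cipher component recovered from the first message (Par. 22a).
COMPONENT = "EXHAUSTINGBCDFJKLMOPQRVWYZ"
PERIOD = 7

# Cipher text of Par. 23b, spaces removed.
MESSAGE = (
    "SFDZR YRRKX MIWLL AQRLU RQFRT IJQKF XUWBS MDJZK MICQC UDPTV "
    "TYRNH TRORV BQLTI QBNPR RTUHD PTIVE RMGQN LRATQ PLUKR KGRZF "
    "JCMGP IHSMR GQRFX BCABA OEMTL PCXJM RGQSZ VB"
).replace(" ", "")


def equivalents(cipher):
    """Plain-component equivalents with E of the component under A."""
    return "".join(chr(65 + COMPONENT.index(c)) for c in cipher)


def generatrix(equivs, n):
    """The n-th generatrix: every letter advanced n places in the normal alphabet."""
    return "".join(chr(65 + (ord(c) - 65 + n) % 26) for c in equivs)


# One string of equivalents per column of Fig. 14.
columns = [equivalents(MESSAGE[i::PERIOD]) for i in range(PERIOD)]

# Juxtapose the 23d generatrix of Column 1 with the 1st of Column 2.
digraphs = [
    a + b
    for a, b in zip(generatrix(columns[0], 23), generatrix(columns[1], 1))
]

if __name__ == "__main__":
    for n in range(1, 26):
        print(f"{n:>2} {generatrix(columns[0], n)}")
    print(digraphs)
```

The digraphs so produced are then extended by trying the generatrices of Column 3 against them, and so on through the period.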

g. Recovery of the key. — It may be of interest to find the key
word for the message. Assuming that enciphering method num-
ber 1 (see Par. 7f, page 6) were known to be employed, all that
is necessary is to set the mixed component of the cipher alphabet
underneath the plain component so as to produce the cipher letter
indicated as the equivalent of any given plain-text letter in each
of the alphabets. For example, in the first alphabet it is noted that
Cp = Sc. Adjust the two components under each other so as to
bring S of the cipher component beneath C of the plain component,
thus:

Plain ....  ABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZ
Cipher ...                         EXHAUSTINGBCDFJKLMOPQRVWYZ

It is noted that Ap = Ac. Hence, the first letter of the key word to the message is A. The 2d,
3d, 4th, ... 7th key letters are found in exactly the same manner, and the following is obtained:

When    C O F I R S T    equals
        S F D Z R Y R    then Ap successively equals
        A Z I M U T H
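
The same adjustment of the two components may be expressed arithmetically. The sketch below is a minimal Python illustration under the assumption stated in the text, namely that the key letter is the cipher-component letter standing under Ap; the function name is hypothetical.

```python
# Mixed cipher component recovered from the first message (Par. 22a).
COMPONENT = "EXHAUSTINGBCDFJKLMOPQRVWYZ"


def key_letter(plain, cipher, component=COMPONENT):
    """Letter of the component under plain A when `cipher` stands under `plain`."""
    offset = component.index(cipher) - (ord(plain) - ord("A"))
    return component[offset % 26]


def recover_key(plain_text, cipher_text):
    """Key letters for successive plain-cipher pairs."""
    return "".join(key_letter(p, c) for p, c in zip(plain_text, cipher_text))


if __name__ == "__main__":
    print(recover_key("COFIRST", "SFDZRYR"))
```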

24. Summation of relative frequencies as an aid to the selection of the correct generatrices. — 

a. In the foregoing example, under subparagraph f, there occurs this phrase: “After some
experimenting with these generatrices . . .” By this was meant, of course, that the selection of
the correct initial pair of generatrices of plain-text equivalents is in this process a matter of trial
and error. The test of “correctness” is whether, when juxtaposed, the two generatrices so
selected yield “good” digraphs, that is, high-frequency digraphs such as occur in normal plain
text. In his early efforts the student may have some difficulty in selecting, merely by ocular 
examination, the most likely generatrices to try. There may be in each diagram several gen- 
eratrices which contain good assortments of high-frequency letters, and the number of trials of 
combinations of generatrices may be quite large. Perhaps a simple mathematical method may 
be of assistance in the process. 

b. Suppose, in Fig. 16, that each letter were accompanied by a number which corresponds 
to its relative frequency in normal English telegraphic text. Then, by adding the numbers along
each horizontal line, the totals thus obtained will serve as relative numerical measures of the 
frequency values of the respective generatrices. Theoretically, the generatrix with the greatest 
value will be the correct generatrix because its total will represent the sum of the individual 
values of the actual plaintext letters. In actual practice, of course, the generatrix with the 
greatest value may not be the correct one, but the correct one will certainly be among the three 
or four generatrices with the largest values. Thus, the number of trials may be greatly reduced, 
in the attempt to put together the correct generatrices. 

c. Using the preceding message as an example, note the respective generatrix values in Fig. 
18. The frequency values of the respective letters shown in the figure are based upon the normal 
distribution for War Department telegraphic text (see Table 3, Appendix 1, Military Crypt- 
analysis, Part I). 
Generatrix                                                                Total

   20    H  J  X  H  H  G  O  A  M  B  A  Q  K  Y  H  W  H  M  V  T
         3  0  0  3  3  2  8  7  2  1  7  0  0  2  3  2  3  2  2  9        59

   21    I  K  Y  I  I  H  P  B  N  C  B  R  L  Z  I  X  I  N  W  U
         7  0  2  7  7  3  3  1  8  3  1  8  4  0  7  0  7  8  2  3        81

   22    J  L  Z  J  J  I  Q  C  O  D  C  S  M  A  J  Y  J  O  X  V
         0  4  0  0  0  7  0  3  8  4  3  6  2  7  0  2  0  8  0  2        56

   23    K  M  A  K  K  J  R  D  P  E  D  T  N  B  K  Z  K  P  Y  W
         0  2  7  0  0  0  8  4  3 13  4  9  8  1  0  0  0  3  2  2        66

   24    L  N  B  L  L  K  S  E  Q  F  E  U  O  C  L  A  L  Q  Z  X
         4  8  1  4  4  0  6 13  0  3 13  3  8  3  4  7  4  0  0  0        85

   25    M  O  C  M  M  L  T  F  R  G  F  V  P  D  M  B  M  R  A  Y
         2  8  3  2  2  4  9  3  8  2  3  2  3  4  2  1  2  8  7  2        77

                              FIGURE 18.
d. It will be noted that the frequency value of the 23d generatrix for the first column of
cipher letters is the greatest; that of the first generatrix for the second column is the greatest.
In both cases these are the correct generatrices. Thus the selection of the correct generatrices
in such cases has been reduced to a purely mathematical basis which is at times of much assistance
in effecting a quick solution. Moreover, an understanding of the principles involved will be of
considerable value in subsequent work.
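
The summation just described lends itself to a simple Python sketch. The frequency values below are whole-number values per 100 letters that agree with the totals shown in Fig. 18; they are given here as an assumption and should be checked against Table 3 before being relied upon. The function names are hypothetical, and the equivalents are those of Column 2 at the setting of Par. 23d.

```python
import string

# Assumed frequency values per 100 letters, A through Z (consistent with Fig. 18).
FREQ = dict(zip(
    string.ascii_uppercase,
    [7, 1, 3, 4, 13, 3, 2, 3, 7, 0, 0, 4, 2, 8, 8, 3, 0, 8, 6, 9, 3, 2, 2, 0, 2, 0],
))

# Plain-component equivalents of Column 2 of the message in Par. 23b.
COLUMN_2 = "NPDNNMUGSHGWQENCNSBZ"


def generatrix(equivs, n):
    """The n-th generatrix: every letter advanced n places in the normal alphabet."""
    return "".join(chr(65 + (ord(c) - 65 + n) % 26) for c in equivs)


def value(text):
    """Sum of the frequency values of the letters on one generatrix."""
    return sum(FREQ[c] for c in text)


def ranked(equivs, top=4):
    """The `top` generatrix numbers with the greatest values, best first."""
    scores = {n: value(generatrix(equivs, n)) for n in range(26)}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:top]


if __name__ == "__main__":
    for number, total in ranked(COLUMN_2):
        print(number, generatrix(COLUMN_2, number), total)
```

Only the three or four generatrices heading the list need then be tried in combination.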

25. Solution by the probable-word method. — a. Occasionally one may encounter a crypto- 
gram which is so short that it contains no recurrences even of digraphs, and thus gives no indi- 
cations of the number of alphabets involved. If the sliding mixed component is known, one may 
apply the method illustrated in Par. 15, assuming the presence of a probable word, checking it
against the text and the sliding components to establish a key, if the correspondents are using
key words.

b. For example, suppose that the presence of the word ENEMY is assumed in the message
in Par. 23b above. One proceeds to check it against an unknown key word, sliding the already
reconstructed mixed component against the normal and starting with the first letter of the 
cryptogram, in this manner: 

When ENEMY equals 

SFDZR then Ap successively equals 
XENFW 

The sequence XENFW spells no intelligible word. Therefore, the location of the assumed word 
ENEMY is shifted one letter forward in the cipher text, and the test is made again, just as was 
explained in Par. 15. When the group AQRLU is tried, the key letters ZIMUT are obtained, 
which, taken as a part of a word, suggests the word AZIMUTH. The method must yield solution 
when the correct assumptions are made. 
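
The successive trials of the probable word can be sketched in Python as below. This is an illustration only, with hypothetical function names; it assumes, as before, that the key letter is the letter of the mixed component standing under Ap, and it uses only the first twenty letters of the cryptogram.

```python
# Mixed cipher component recovered from the first message (Par. 22a).
COMPONENT = "EXHAUSTINGBCDFJKLMOPQRVWYZ"

# First four groups of the message in Par. 23b.
CIPHER = "SFDZRYRRKXMIWLLAQRLU"


def key_letter(plain, cipher, component=COMPONENT):
    """Letter of the component under plain A when `cipher` stands under `plain`."""
    return component[(component.index(cipher) - (ord(plain) - 65)) % 26]


def slide_word(cipher_text, word):
    """Yield (starting position, key fragment) for each placement of the word."""
    for start in range(len(cipher_text) - len(word) + 1):
        segment = cipher_text[start:start + len(word)]
        yield start, "".join(key_letter(p, c) for p, c in zip(word, segment))


if __name__ == "__main__":
    for start, fragment in slide_word(CIPHER, "ENEMY"):
        print(f"{start:>2} {CIPHER[start:start + 5]} {fragment}")
```

The fragments are scanned by eye for a pronounceable sequence; the machine merely saves the labor of sliding the strips.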

c. The danger to cryptographic security resulting from the inclusion of cryptographed 
addresses and signatures in cryptographic messages becomes quite obvious in the light of 
solution by the probable-word method. To illustrate, reference is made to the message employed 
in Pars. 19-22. It will be noted in Par. 22b that the message carried a signature (Treer, Col.)
and that the latter was enciphered. Suppose that this were an authorized practice, and that 
every message could be assumed to conclude with a cryptographed signature. The signature 
“TREER COL” would at once afford a very good basis for the quick solution of subsequent mes- 
sages emanating from the same headquarters as did the first message, because presumably this 
same signature would appear in other messages. It is for this reason that addresses and signa-
tures must not be cryptographed; if they must be included they should be cryptographed in a
totally different system or by a wholly different method, perhaps by means of a special address 
and signature code. It would be best, however, to omit all addresses and signatures, and to 
let the call signs of the headquarters concerned also convey these parts of the message, leaving 
the delivery to the addressee a matter for local action. 

26. Solution when the plain component is a mixed sequence, the cipher component, the
normal. — a. This falls under Case B (2) outlined in Par. 6. It is not the usual method of
employing a single mixed component, but may be encountered occasionally in cipher devices. 

b. The preliminary steps, as regards factoring to determine the length of the period, are
the same as usual. The message is then transcribed into its periods. Frequency distributions
are then made, as usual, and these are attacked by the principles of frequency and recurrence.
An attempt is made to apply the principles of direct symmetry of position, but this attempt
will be futile, for the reason that the plain component is in this case an unknown mixed sequence.
(See Par. 18d.) Any attempt to find symmetry in the secondary alphabets based upon the normal
sequence can therefore disclose no symmetry because the symmetry which exists is based upon a
wholly different sequence.

c. However, if the principles of direct symmetry of position are of no avail in this case,
there are certain other principles of symmetry which may be employed to great advantage. 
To explain them an actual example will be used. Let it be assumed that it is known to the 
cryptanalyst that the enemy is using the general system under discussion, viz, a mixed sequence, 
variable from day to day, is used as plain component; the normal sequence is used as cipher 
component; and a repeating key, variable from message to message, is used in the ordinary 
manner. 

The following message has been intercepted:

        1           2           3           4           5           6

A.  Q E O V K   L R M L Z   J V G T G   N D L V K   E V N T Y   E R M U E
B.  V R Z M O   Y A A M P   D K E I J   S F M Y O   Y H M M E   G Q A M B
C.  U Q A X R   H U F B U   K Q Y M U   N E L V T   K Q I L E   K Z B U E
D.  U L I B K   N D A X B   X U D G L   L A D V K   P O A Y O   D K K Y K
E.  L A D H Y   B V N F V   U E E M E   F F M T E   G V W B Y   T V D Z L
F.  S P B H B   X V A Z C   U D Y U E   L K M M A   E U D D K   N C F S H
G.  H S A H Y   T M G U J   H Q X P P   D K O U E   X U Q V B   F V W B X
H.  N X A L B   T C D L M   I V A A A   N S Z I L   O V W V P   Y A G Z L
J.  S H M M E   G Q D H O   Y H I V P   N C R R E   X K D Q Z   G K N C G
K.  N Q G U Y   J I W Y Y   T M A H W   X R L B L   O A D L G   N Q G U Y
L.  J U U G B   J H R V X   E R F L E   G W G U O   X E D T P   D K E I Z
M.  V X N W A   F A A N E   M K G H B   S S N L O   K J C B Z   T G G L O
N.  P K M B X   H G E R Y   T M W L Z   N Q C Y Y   T M W I P   D K A T E
P.  F L N U J   N D T V X   J R Z T L   O P A H C   D F Z Y Y   D E Y C L
Q.  G P G T Y   T E C X B   H Q E B R   K V W M U   N I N G J   I Q D L P
R.  J K A T E   G U W B R   H U Q W M   V R Q B W   Y R F B F   K M W M B
S.  T M U L Z   L A A H Y   J G D V K   L K R R E   X K N A O   N D S B X
T.  X C G Z A   H D G T L   V K M B W   I S A U E   F D N W P   N L Z I J
V.  S R Q Z L   A V N H L   G V W V K   F I G H P   G E C Z U   K Q A P


d. A study of the recurrences and factoring their intervals discloses that five alphabets are 
involved. Uniliteral frequency distributions are made and are shown in Fig. 19a: 



Alphabet 1

ABCDEFGHIJKLMNOPQRSTUVWXYZ

Alphabet 2

ABCDEFGHIJKLMNOPQRSTUVWXYZ

Alphabet 3

ABCDEFGHIJKLMNOPQRSTUVWXYZ

Alphabet 4

ABCDEFGHIJKLMNOPQRSTUVWXYZ

Alphabet 5

ABCDEFGHIJKLMNOPQRSTUVWXYZ

FIGURE 19a.

e. Since the cipher component in this case is the normal alphabet, it follows that the five
frequency distributions are based upon a sequence which is known, and therefore, the five frequency
distributions should manifest a direct symmetry of distribution of crests and troughs. By virtue of
this symmetry and by shifting the five distributions relative to one another to proper superim-
positions, the several distributions may be combined into a single uniliteral distribution. Note
how this shifting has been done in the case of the five illustrative distributions:

Alphabet 1

ABCDEFGHIJKLMNOPQRSTUVWXYZ

Alphabet 2

XYZABCDEFGHIJKLMNOPQRSTUVW

Alphabet 3

TUVWXYZABCDEFGHIJKLMNOPQRS

Alphabet 4

OPQRSTUVWXYZABCDEFGHIJKLMN

Alphabet 5

RSTUVWXYZABCDEFGHIJKLMNOPQ

FIGURE 19b.
f. The superimposition of the respective distributions enables one to convert the cipher
letters of the five alphabets into one alphabet. Suppose it is decided to convert Alphabets
2, 3, 4, and 5 into Alphabet 1. It is merely necessary to substitute for the respective letters in
the four alphabets those which stand above them in Alphabet 1. For example, in Fig. 19b, Xc
in Alphabet 2 is directly under Ac in Alphabet 1; hence, if the superimposition is correct, then
Xc of Alphabet 2 = Ac of Alphabet 1. Therefore, in the cryptogram it is merely necessary to
replace every Xc in the second position by Ac. Again, Tc in Alphabet 3 = Ac in Alphabet 1;
therefore, in the cryptogram one replaces every Tc in the third position by Ac. The entire
process, hereinafter designated as conversion into monoalphabetic terms, gives the following
converted message:





        1           2           3           4           5           6

A.  Q H V H T   L U T X I   J Y N F P   N G S H T   E Y U F H   E U T G N
B.  V U G Y X   Y D H Y Y   D N L U S   S I T K X   Y K T Y N   G T H Y K
C.  U T H J A   H X M N D   K T F Y D   N H S H C   K T P X N   K C I G N
D.  U O P N T   N G H J K   X X K S U   L D K H T   P R H K X   D N R K T
E.  L D K T H   B Y U R E   U H L Y N   F I T F N   G Y D N H   T Y K L U
F.  S S I T K   X Y H L L   U G F G N   L N T Y J   E X K P T   N F M E Q
G.  H V H T H   T P N G S   H T E B Y   D N V G N   X X X H K   F Y D N G
H.  N A H X K   T F K X V   I Y H M J   N V G U U   O Y D H Y   Y D N L U
J.  S K T Y N   G T K T X   Y K P H Y   N F Y D N   X N K C I   G N U O P
K.  N T N G H   J L D K H   T P H T F   X U S N U   O D K X P   N T N G H
L.  J X B S K   J K Y H G   E U M X N   G Z N G X   X H K F Y   D N L U I
M.  V A U I J   F D H Z N   M N N T K   S V U X X   K M J N I   T J N X X
N.  P N T N G   H J L D H   T P D X I   N T J K H   T P D U Y   D N H F N
P.  F O U G S   N G A H G   J U G F U   O S H T L   D I G K H   D H F O U
Q.  G S N F H   T H J J K   H T L N A   K Y D Y D   N L U S S   I T K X Y
R.  J N H F N   G X D N A   H X X I V   V U X N F   Y U M N O   K P D Y K
S.  T P B X I   L D H T H   J J K H T   L N Y D N   X N U M X   N G Z N G
T.  X F N L J   H G N F U   V N T N F   I V H G N   F G U I Y   N O G U S
V.  S U X L U   A Y U T U   G Y D H T   F L N T Y   G H J L D   K T H B



The uniliteral frequency distribution for this converted text follows. Note that the frequency
of each letter is the sum of the five frequencies in the corresponding columns of Fig. 19b.

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

FIGURE 20.
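
Conversion into monoalphabetic terms, and the tally of the converted text, may be sketched in Python as follows. The sketch is illustrative only and its names are hypothetical; it assumes the superimposition of Fig. 19b, in which the letters A, X, T, O, and R of Alphabets 1 to 5 respectively stand in the same column, and it treats only the first line of the cryptogram.

```python
from collections import Counter

# Letter of each alphabet standing in the column of A of Alphabet 1 (Fig. 19b).
FIRST_LETTERS = "AXTOR"

# Line A of the cryptogram in Par. 26c, spaces removed.
LINE_A = "QEOVK LRMLZ JVGTG NDLVK EVNTY ERMUE".replace(" ", "")


def to_alphabet_1(cipher, first_letters=FIRST_LETTERS):
    """Replace each cipher letter by the Alphabet 1 letter standing above it."""
    converted = []
    for position, letter in enumerate(cipher):
        shift = ord(first_letters[position % len(first_letters)]) - 65
        converted.append(chr(65 + (ord(letter) - 65 - shift) % 26))
    return "".join(converted)


def tally(text):
    """Uniliteral frequency distribution of the text."""
    return Counter(text)


if __name__ == "__main__":
    converted = to_alphabet_1(LINE_A)
    print(converted)
    for letter, count in sorted(tally(converted).items()):
        print(letter, count)
```

Applied to the whole cryptogram, the same two functions yield the converted message and the distribution to which Fig. 20 refers.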
g. The problem having been reduced to monoalphabetic terms, a triliteral frequency distri-
bution can now be made and solution readily attained by simple principles. It yields the
following:

JAPAN CONSULTED GERMANY TODAY ON REPORTS THAT THE COMMUNIST INTERNATIONAL
WAS BEHIND THE AMAZING SEIZURE OF GENERALISSIMO CHIANG KAI-SHEK IN CHINA.
TOKYO ACTED UNDER THE ANTICOMMUNIST ACCORD RECENTLY SIGNED BY JAPAN AND GER- 
MANY. THE PRESS SAID THERE WAS INDISPUTABLE PROOF THAT THE COMINTERN INSTI- 
GATED THE SEIZURE OF GENERAL CHIANG AND SOME OF HIS GENERALS. MILITARY OB- 
SERVERS SAID THE COUP WOULD HAVE BEEN IMPOSSIBLE UNLESS GENERAL CHANG HSUEN 
LIANG HOTHEADED FORMER WAR LORD OF MANCHURIA HAD FORMED AN ALLIANCE WITH THE 
COMMUNIST LEADERS HE WAS SUPPOSED TO BE FIGHTING. SUCH AN ALLIANCE THESE 
OBSERVERS DECLARED OPENED UP A RED ROUTE FROM MOSCOW TO NORTH AND CENTRAL 
CHINA. 

h. The reconstruction of the plain component is now a very simple matter. It is found to 
be as follows: 

HYDRAULICBEFGJKMNOPQSTVWXZ 

Note also, in Fig. 19b, the key word for the message (HEAVY), the letters being in the columns
headed by the letter H.
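
With the plain component and the key word in hand, the system may be reproduced in a few lines of Python. The sketch below is illustrative, with hypothetical function names; it assumes that A of the mixed plain component is set against each key letter on the normal cipher component, as was done in this example.

```python
# Mixed plain component reconstructed in subparagraph h.
PLAIN_COMPONENT = "HYDRAULICBEFGJKMNOPQSTVWXZ"
INDEX = PLAIN_COMPONENT.index("A")


def decipher(cipher, key):
    """Decipher with the normal cipher component and a repeating key."""
    plain = []
    for position, letter in enumerate(cipher):
        key_letter = key[position % len(key)]
        place = (ord(letter) - ord(key_letter) + INDEX) % 26
        plain.append(PLAIN_COMPONENT[place])
    return "".join(plain)


def encipher(plain, key):
    """Inverse of decipher, under the same assumptions."""
    cipher = []
    for position, letter in enumerate(plain):
        key_letter = key[position % len(key)]
        place = (PLAIN_COMPONENT.index(letter) - INDEX + ord(key_letter) - 65) % 26
        cipher.append(chr(65 + place))
    return "".join(cipher)


if __name__ == "__main__":
    print(decipher("QEOVKLRMLZJVGTGNDLVKEVNTYERMUE", "HEAVY"))
```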

i. The solution of subsequent messages with different keys can now be reached directly, by 
a simple modification of the principles explained in Par. 18. This modification consists in using
for the completion sequence the mixed plain component (now known) instead of the normal alpha- 
bet, after the cipher letters have been converted into their plain-component equivalents. Let 
the student confirm this by experiment. 

j. The probable-word method of solution discussed under Paragraph 20 is also applicable 
here, in case of very short cryptograms. This method presupposes of course, possession of the 
mixed component and the procedure is essentially the same as that in Par. 20. In the example
discussed in the present paragraph, the letter A on the plain component was successively set 
against the key letters HEAVY ; but this is not the only possible procedure. 

k. The student should go over carefully the principle of “conversion into monoalphabetic
terms” explained in subparagraph f above until he thoroughly understands it. Later on he will
encounter cases in which this principle is of very great assistance in the cryptanalysis of more
complex problems. (Another example will be found under Par. 45.)

l. The principle illustrated in subparagraph e, that is, shifting two or more monoalphabetic
frequency distributions relatively so as to bring them into proper alignment for amalgamation
into a single monoalphabetic distribution, is called matching. It is a very important crypt-
analytic principle. Note that its practical application consists in sliding one monoalphabetic
distribution against the other so as to obtain the best coincidence between the entire sequence
of crests and troughs of one distribution and the entire sequence of crests and troughs of the other
distribution. When the best point of coincidence has been found, the two sequences may be
amalgamated and theoretically the single resultant distribution will also be monoalphabetic in
character. The successful application of the principle of matching depends upon several factors.
First, the cryptographic situation must be such that matching is a correct cryptographic step.
For example, the distributions in figure 19b are properly subject to matching because the cipher
component in the basic sequences concerned in this problem is the normal sequence, while the
plain component is a mixed sequence. But it would be futile to try to match the distributions
in figure 9, for in that case the cipher component is a mixed sequence, the plain component is
the normal sequence. Hence, no amount of shifting or matching can bring the distributions of
figure 9 into proper superimposition for correct amalgamation. (If the occurrences in the various
distributions in figure 9 had been distributed according to the sequence of letters in the mixed
component, then matching would be possible; but in order to be able to distribute these occur-
rences according to the mixed component, the latter has to be known — and that is just what is
unknown until the problem has been solved.) A second factor involved in successful matching
is the number of elements in the two distributions forming the subject of the test. If both
of them have very few tallies, there is hardly sufficient information to permit of matching with
any degree of assurance that the work is not in vain. If one of them has many tallies, the other
only a few, the chances for success are better than before, because the positions of the blanks in
the two distributions can be used as a guide for their proper superimposition.

m. There are certain mathematical and statistical procedures which can be brought to bear 
upon the matter of cryptanalytic matching. These will be presented in a later text. However,
until the student has studied these mathematical and statistical methods of matching distri- 
butions, he will have to rely upon mere ocular examination as a guide to proper superimposition. 
Obviously, the more data he has in each distribution, the easier is the correct superimposition 
ascertained by any method. 
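
Pending the methods reserved for the later text, the idea of sliding one distribution against another can at least be imitated numerically. The Python sketch below is not the procedure of that later text; it is a simple stand-in, assumed here for illustration, which scores each relative displacement by the sum of the products of the superimposed tallies and takes the displacement with the greatest sum. All names and the sample tallies are hypothetical.

```python
def cross_product_sum(first, second, shift):
    """Sum of products of tallies when `second` is read `shift` cells ahead."""
    return sum(a * second[(i + shift) % 26] for i, a in enumerate(first))


def best_shift(first, second):
    """Displacement of `second` giving the best coincidence with `first`."""
    return max(range(26), key=lambda s: cross_product_sum(first, second, s))


def rotate(distribution, places):
    """Move every tally `places` cells to the right, wrapping around."""
    return [distribution[(i - places) % 26] for i in range(26)]


if __name__ == "__main__":
    # Hypothetical tallies for cells A through Z of one distribution.
    sample = [7, 1, 3, 4, 13, 3, 2, 3, 7, 0, 0, 4, 2,
              8, 8, 3, 0, 8, 6, 9, 3, 2, 2, 0, 2, 0]
    print(best_shift(sample, rotate(sample, 3)))
```

With actual distributions of few tallies the greatest sum may fall at a wrong displacement, so the result should be treated as a guide to ocular examination rather than a substitute for it.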







Section VI 

REPEATING-KEY SYSTEMS WITH MIXED CIPHER ALPHABETS, II

Paragraph

Further cases to be considered 27

Identical primary mixed components proceeding in the same direction 28

Cryptographing and decryptographing by means of identical primary mixed components 29

Principles of solution 30

27. Further cases to be considered. — a. Thus far Cases B (1) and (2), mentioned in Para- 
graph 6 have been treated. There remains Case B (3), and this case has been further subdivided 
as follows: 

Case B (3). Both components are mixed sequences. 

(a) Components are identical mixed sequences. 

(1) Sequences proceed in the same direction. (The secondary alphabets are mixed
alphabets.)

(2) Sequences proceed in opposite directions. (The secondary alphabets are
reciprocal mixed alphabets.)

(b) Components are different mixed sequences. (The secondary alphabets are mixed
alphabets.)

b. The first of the foregoing subcases will now be examined.

28. Identical primary mixed components proceeding in the same direction. — a. It is often
the case that the mixed components are derived from an easily remembered word or phrase, 
so that they can be reproduced at any time from memory. Thus, for example, given the key 
word QUESTIONABLY, the following mixed sequence is derived: 

QUESTIONABLYCDFGHJKMPRVWXZ 
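
The derivation of a keyword-mixed sequence may be sketched in Python as follows. The function name is hypothetical, and the sketch assumes the simple rule illustrated here: the key word is written first, repeated letters being omitted, and the remaining letters of the alphabet follow in normal order.

```python
import string


def keyword_mixed(keyword):
    """Key word (repeated letters dropped) followed by the unused letters."""
    ordered = dict.fromkeys(keyword.upper() + string.ascii_uppercase)
    return "".join(ordered)


if __name__ == "__main__":
    print(keyword_mixed("QUESTIONABLY"))
```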

b. By using this sequence as both plain and cipher component, that is, by sliding this
sequence against itself, a series of 26 secondary mixed alphabets may be produced. In encipher-
ing a message, sliding strips may be employed with a key word to designate the particular and
successive positions in which the strips are to be set, the same as was the case in previous examples
of the use of sliding components. The method of designating the positions, however, requires
a word or two of comment at this point. In the examples thus far shown, the key letter, as
located on the cipher component, was always set opposite A, as located on the plain component;
possibly an erroneous impression has been created, viz, that this is invariably the rule. This
is decidedly not true, as has already been explained in paragraph 7c. If it has seemed to be the
case that θ1 always equals Ap, it is only because the text has dealt thus far principally with cases in
which the plain component is the normal sequence and its initial letter, which usually consti-
tutes the index for juxtaposing cipher components, is A. It must be emphasized, however,
that various conventions may be adopted in this respect; but the most common of them is to
employ the initial letter of the plain component as the index letter. That is, the index letter,
θ1, will be the initial letter of the mixed sequence, in this case, Q. Furthermore, to prevent the
possibility of ambiguity it will be stated again that the pair of enciphering equations employed
in the ensuing discussion will be the first of the 12 set forth under Par. 7f, viz, θk/2 = θ1/1; θp/1 = θc/2.
In this case the subscript “1” means the plain component, the subscript “2”, the cipher
component, so that the enciphering equation is the following: θk/c = θ1/p; θp/p = θc/c.
c. By setting the two sliding components against each other in the two positions shown
below, the cipher alphabets labeled (1) and (2) given by two key letters, A and B, are seen to be
different.

Key letter = A
                                              θ1
                                              ↓
Plain component...  QUESTIONABLYCDFGHJKMPRVWXZQUESTIONABLYCDFGHJKMPRVWXZ
Cipher component..                    QUESTIONABLYCDFGHJKMPRVWXZ
                                              ↑
                                              θk

Secondary alphabet (1):

Plain ....  ABCDEFGHIJKLMNOPQRSTUVWXYZ
Cipher ...  HJPRLVWXDZQKUGFEASYCBTIOMN

Key letter = B
                                              θ1
                                              ↓
Plain component...  QUESTIONABLYCDFGHJKMPRVWXZQUESTIONABLYCDFGHJKMPRVWXZ
Cipher component..                   QUESTIONABLYCDFGHJKMPRVWXZ
                                              ↑
                                              θk

Secondary alphabet (2):

Plain ....  ABCDEFGHIJKLMNOPQRSTUVWXYZ
Cipher ...  JKRVYWXZFQUMEHGSBTCDLIONPA
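
The two secondary alphabets above may be reproduced by the following Python sketch, which is illustrative only. The function name is hypothetical; the sketch assumes the convention just stated, that the key letter on the cipher component is set under the index letter, the initial letter of the plain component.

```python
import string

# Primary component derived from the key word QUESTIONABLY (Par. 28a).
COMPONENT = "QUESTIONABLYCDFGHJKMPRVWXZ"


def secondary_alphabet(component, key_letter):
    """Cipher equivalents of A through Z when key_letter stands under the index letter."""
    shift = component.index(key_letter)  # the index letter is component[0]
    return "".join(
        component[(component.index(plain) + shift) % 26]
        for plain in string.ascii_uppercase
    )


if __name__ == "__main__":
    for key in "AB":
        print(key, secondary_alphabet(COMPONENT, key))
```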

d. Very frequently a quadricular or square table is employed by the correspondents, instead
of sliding strips, but the results are the same. The cipher square based upon the word QUESTION-
ABLY is shown in Fig. 21. It will be noted that it does nothing more than set forth the successive
positions of the two primary sliding components; the top line of the square is the plain component,
the successive horizontal lines below it, the cipher component in its various juxtapositions. The
usual method of employing such a square (i. e., corresponding to the enciphering equations
θk/c = θ1/p; θp/p = θc/c) is to take as the cipher equivalent of a plain-text letter that letter which
lies at the intersection of the vertical column headed by the plain-text letter and the horizontal
row begun by the key letter. For example, the cipher equivalent of Ep with key letter T is the
letter Oc; or Ep (Tk) = Oc. The method given in paragraph b, for determining the cipher equiva-
lents by means of the two sliding strips yields the same results as does the cipher square.
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QUESTIONABLYCDFGHJKMPRVWXZ 

UESTIONABLYCDFGHJKMPRVWXZQ 

ESTIONABLYCDFGHJKMPRVWXZQU 

STIONABLYCDFGHJKMPRVWXZQUE 

TIONABLYCDFGHJKMPRVWXZQUES 

lONABLYCDFGHJKMPRVWXZQUEST 

ONABLYCDFGHJKMPRVffXZQUESTI 

NABLYCDFGHJKMPRVWXZQUESTIO 

ABLYCDFGHJKMPRVWXZQUESTION 

BLYGDFGHJKMPRVWXZQUESTIONA 

LYCDFGHJKMPRVWXZQUESTIONAB 

YCDFGHJKMPRVWXZQUESTIONABL 

CDFGHJKMPRVWXZQUESTIONABLY 

DFGHJKMPRVWXZQUESTIONABLYC 

FGHJKMPRVWXZQUESTIONABLYCD 

GHJKMPRVWXZQUESTIONABLYCDF 

HJKMPRVWXZQUESTIONABLYCDFG 

JKMPRVVXZQUESTIONABLYCDFGH 

KMPRVWXZQUESTIONABLYCDFGHJ 

MPRVWXZQUESTIONABLYCDFGHJK 

PRVWXZQUESTIONABLYCDFGHJKM 

RVWXZQUESTIONABLYCDFGHJKMP 

VWXZQUESTIONABLYCDFGHJKMPR 

WXZQUESTIONABLYCDFGHJKMPRV 

XZQUESTIONABLYCDFGHJKMPRVW 

ZQUESTIONABLYCDFGHJKMPRVWX 

FlomB 31. 
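
The use of the square may likewise be sketched in Python. The names below are hypothetical; the sketch assumes the usual method described in subparagraph d, the column being found on the top line and the row by its initial letter.

```python
# Primary component derived from the key word QUESTIONABLY (Par. 28a).
COMPONENT = "QUESTIONABLYCDFGHJKMPRVWXZ"


def cipher_square(component):
    """The 26 successive juxtapositions of the component, as in Fig. 21."""
    return [component[i:] + component[:i] for i in range(len(component))]


def encipher_letter(square, plain, key_letter):
    """Letter at the intersection of the plain-letter column and the key-letter row."""
    column = square[0].index(plain)
    row = next(line for line in square if line[0] == key_letter)
    return row[column]


def encipher(square, plain_text, key):
    """Encipher a message with a repeating key."""
    return "".join(
        encipher_letter(square, letter, key[i % len(key)])
        for i, letter in enumerate(plain_text)
    )


if __name__ == "__main__":
    square = cipher_square(COMPONENT)
    print(encipher_letter(square, "E", "T"))
```

The result for any plain letter and key letter agrees with that given by the sliding strips, since each row of the square is merely one setting of the cipher strip.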

29. Cryptographing and decryptographing by identical primary mixed components. — There
is nothing of special interest to be noted in connection with the use either of identical mixed
components or of an equivalent quadricular table such as that shown in Fig. 21, in enciphering or
deciphering a message. The basic principles are the same as in the case of the sliding of one
mixed component against the normal, the displacements of the two components being controlled
by changeable key words of varying lengths. The components may be changed at will and so on.
All this has been demonstrated adequately enough in Elementary Military Cryptography, and
Advanced Military Cryptography.

30. Principles of solution. — a. Basically the principles of solution in the case of a crypto- 
gram enciphered by two identical mixed sliding components are the same as in the preceding 
case. Primary recourse is had to the principles of frequency and repetition of single letters, 
digraphs, trigraphs, and polygraphs. Once an entering wedge has been forced into the problem, 
the subsequent steps may consist merely in continuing along the same lines as before, building 
up the solution bit by bit. 

b. Doubtless the question has already arisen in the student's mind as to whether any
principles of symmetry of position can be used to assist in the solution and in the reconstruction
of the cipher alphabets in cases of the kind under consideration. This phase of the subject will
be taken up in the next section and will be treated in a somewhat detailed manner, because the
theory and principles involved are of very wide application in cryptanalytics.
Section VII

THEORY OF INDIRECT SYMMETRY OF POSITION IN SECONDARY ALPHABETS

Paragraph

Reconstruction of primary components from secondary alphabets 31

31. Reconstruction of primary components from secondary alphabets. — a. Note the two
secondary alphabets (1) and (2) given in paragraph 28c. Externally they show no resemblance 
or symmetry despite the fact that they were produced from the same primary components. 
Nevertheless, when the matter is studied with care, a symmetry of position is discoverable. 
Because it is a hidden or latent phenomenon, it may be termed latent symmetry of position. 
However, in previous texts the phenomenon has been designated as an indirect symmetry of position 
and this terminology has grown into usage, so that a change is perhaps now inadvisable. 
Indirect symmetry of position is a very interesting and exceedingly useful phenomenon in 
cryptanalytics. 

b. Consider the following secondary alphabet (the one labeled (2) in paragraph 28c): 

(2)  Plain    ABCDEFGHIJKLMNOPQRSTUVWXYZ

     Cipher   JKRVYWXZFQUMEHGSBTCDLIONPA

c. Assuming it to be known that this is a secondary alphabet produced by two primary
identical mixed components, it is desired to reconstruct the latter. Construct a chain of
alternating plain-text and cipher-text equivalents, beginning at any point and continuing until
the chain has been completed. Thus, for example, beginning with Ap=Jc, Jp=Qc, Qp=Bc, . . ., and
dropping out the letters common to successive pairs, there results the sequence A J Q B . . . . By
completing the chain the following sequence of letters is established:

AJQBKULMEYPSCRTDVIFWOGXNHZ
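
The chaining procedure of subparagraph c may be sketched as follows. The function name is an assumption made for this illustration; the alphabet is the secondary alphabet labeled (2), with Ip=Fc as required by the chain itself.

```python
PLAIN = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
CIPHER = "JKRVYWXZFQUMEHGSBTCDLIONPA"  # secondary alphabet (2)

def chain(plain, cipher, start="A"):
    """Follow plain -> cipher -> plain ... until the starting letter recurs."""
    enc = dict(zip(plain, cipher))
    out = [start]
    while enc[out[-1]] != start:
        out.append(enc[out[-1]])
    return "".join(out)

print(chain(PLAIN, CIPHER))
```

Beginning at any other letter merely gives the same sequence in a different cyclic position.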

d. This sequence consists of 26 letters. When slid against itself it will produce exactly the
same secondary alphabets as do the primary components based upon the word QUESTIONABLY.
To demonstrate that this is the case, compare the secondary alphabets given by the two settings
of the externally different components shown below:

Plain component   QUESTIONABLYCDFGHJKMPRVWXZQUESTIONABLYCDFGHJKMPRVWXZ
Cipher component                   QUESTIONABLYCDFGHJKMPRVWXZ

Secondary alphabet (1):

Plain    ABCDEFGHIJKLMNOPQRSTUVWXYZ
Cipher   JKRVYWXZFQUMEHGSBTCDLIONPA

Plain component   AJQBKULMEYPSCRTDVIFWOGXNHZAJQBKULMEYPSCRTDVIFWOGXNHZ
Cipher component                           AJQBKULMEYPSCRTDVIFWOGXNHZ

Secondary alphabet (2):

Plain    ABCDEFGHIJKLMNOPQRSTUVWXYZ
Cipher   JKRVYWXZFQUMEHGSBTCDLIONPA
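
The equivalence asserted in subparagraph d can be checked mechanically. The sketch below is illustrative only; the helper names are hypothetical, and the displacement is counted as the number of places the cipher copy is moved against the plain copy.

```python
import string

ORIGINAL = "QUESTIONABLYCDFGHJKMPRVWXZ"
EQUIVALENT = "AJQBKULMEYPSCRTDVIFWOGXNHZ"

def secondary(component, shift):
    """Cipher row, in A..Z plain order, for one displacement of the
    component against itself."""
    n = len(component)
    return "".join(component[(component.index(p) + shift) % n]
                   for p in string.ascii_uppercase)

def alphabets(component):
    """The whole set of secondary alphabets the component can produce."""
    return {secondary(component, s) for s in range(len(component))}

print(secondary(ORIGINAL, 9))
print(secondary(EQUIVALENT, 1))
print(alphabets(ORIGINAL) == alphabets(EQUIVALENT))
```

A displacement of 9 in the original component corresponds to a displacement of 1 in the equivalent component, and the two components yield the same 26 secondary alphabets.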

e. Since the sequence A J Q B K . . . gives exactly the same equivalents in the secondary
alphabets as the sequence Q U E S T . . . gives, the former sequence is cryptographically
equivalent to the latter sequence. For this reason the A J Q B K . . . sequence is termed
an equivalent primary component.¹ If the real or original primary component is a key-word mixed
sequence, it is hidden or latent within the equivalent primary sequence; but it can be made patent
by decimation of the equivalent primary component. The procedure is as follows: Find three
letters in the equivalent primary component such as are likely to have formed an unbroken
sequence in the original primary component, and see if the interval between the first and second
is the same as that between the second and third. Such a case is presented by the letters W, X,
and Z in the equivalent primary component above. Note the sequence . . . W O G X N H Z . . . ;
two letters intervene between W and X, and likewise between X and Z, so that the interval is
three. Continuing the chain by taking every third letter, the latent original primary component
is made patent. Thus:

  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
  W  X  Z  Q  U  E  S  T  I  O  N  A  B  L  Y  C  D  F  G  H  J  K  M  P  R  V
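
The decimation of subparagraph e lends itself to a short sketch. The function name and its parameters are assumptions for this illustration. W, X, and Z stand three places apart in the equivalent component, so every third letter is taken, cyclically, beginning with W.

```python
EQUIVALENT = "AJQBKULMEYPSCRTDVIFWOGXNHZ"

def decimate(sequence, interval, start):
    """Select letters cyclically at a fixed interval, beginning at `start`."""
    n = len(sequence)
    i = sequence.index(start)
    return "".join(sequence[(i + k * interval) % n] for k in range(n))

print(decimate(EQUIVALENT, 3, "W"))
```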

f. It is possible to perform the steps given in c and e in a combined single operation when the
original primary component is a key-word mixed sequence. Starting with any pair of letters (in
the cipher component of the secondary alphabet) likely to be sequent in the key-word mixed
sequence, such as JKc in the secondary alphabet labeled (2), the following chain of digraphs may
be set up. Thus, J, K in the plain component stand over Q, U, respectively, in the cipher
component; Q, U in the plain component stand over B, L, respectively, in the cipher component, and
so on. Connecting the pairs in a series, the following results are obtained:

JK → QU → BL → KM → UE → LY → MP → ES → YC → PR → ST → CD → RV →

TI → DF → VW → IO → FG → WX → ON → GH → XZ → NA → HJ → ZQ → AB → JK . . .

These may now be united by means of their common letters:

JK → KM → MP → PR → RV → etc. = J K M P R V W X Z Q U E S T I O N A B L Y C D F G H

The original primary component is thus completely reconstructed.
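
The combined operation of subparagraph f can be sketched as two small steps: building the chain of digraphs, and then uniting the digraphs through their common letters. The function names are hypothetical; the starting pair JK is the one chosen in the text.

```python
PLAIN = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
CIPHER = "JKRVYWXZFQUMEHGSBTCDLIONPA"  # secondary alphabet (2)

def digraph_chain(plain, cipher, first_pair):
    """Replace each pair by the pair standing under it until the first recurs."""
    enc = dict(zip(plain, cipher))
    pairs, pair = [], first_pair
    while pair not in pairs:
        pairs.append(pair)
        pair = enc[pair[0]] + enc[pair[1]]
    return pairs

def unite(pairs, start):
    """Join the pairs end to end: JK, KM, MP, ... gives J K M P ..."""
    succ = {a: b for a, b in pairs}
    out = start
    while succ[out[-1]] != start:
        out += succ[out[-1]]
    return out

pairs = digraph_chain(PLAIN, CIPHER, "JK")
print(" ".join(pairs[:5]))
print(unite(pairs, "J"))
```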

g. Not all of the 26 secondary alphabets of the series yielded by two sliding primary
components may be used to develop a complete equivalent primary component. If examination be
made, it will be found that only 12 of these secondary alphabets will yield complete equivalent
primary components when the method of reconstruction shown in subparagraph c above is followed.
For example, the following secondary alphabet, which is also derived from the primary components
based upon the word QUESTIONABLY, will not yield a complete chain of 26 plain-text and
cipher-text equivalents:

Plain ABCDEFGHIJKLMNOPQRSTUVWXYZ 

Cipher CDHJOKMPBRVFWYLXTZNAIQUEGS 



¹ Such an equivalent component is merely a sequence which has been or can be developed or derived from
the original sequence or basic primary component by applying a decimation process to the latter; conversely,
the original or basic component can be derived from an equivalent component by applying the same sort of
process to the equivalent component. By decimation is meant the selection of elements from a sequence
according to some fixed interval. For example, the sequence A E I M . . . is derived, by decimation, from the
normal alphabet by selecting every fourth letter.
Equivalent primary component:

  1  2  3  4  5  6  7  8  9 10 11 12 13     1  2  3
  A  C  H  P  X  E  O  L  F  K  V  Q  T     A  C  H . . .  (The A C H sequence begins again.)

h. It is seen that only 13 letters of the chain have been established before the sequence begins
to repeat itself. It is evident that exactly one-half of the chain has been established. The other
half may be established by beginning with a letter not in the first half. Thus:

  1  2  3  4  5  6  7  8  9 10 11 12 13     1  2  3
  B  D  J  R  Z  S  N  Y  G  M  W  U  I     B  D  J . . .  (The B D J sequence begins again.)
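
The two half-chains of subparagraphs g and h can be produced in one pass by following the chain from every letter not yet used. This is a minimal sketch with a hypothetical function name.

```python
PLAIN = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
CIPHER = "CDHJOKMPBRVFWYLXTZNAIQUEGS"  # secondary alphabet of subparagraph g

def all_chains(plain, cipher):
    """Return every closed chain of plain-cipher equivalents."""
    enc = dict(zip(plain, cipher))
    seen, chains = set(), []
    for start in plain:
        if start in seen:
            continue
        chain, letter = "", start
        while letter not in seen:
            seen.add(letter)
            chain += letter
            letter = enc[letter]
        chains.append(chain)
    return chains

print(all_chains(PLAIN, CIPHER))
```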



i. It is now necessary to distribute the letters of each half-sequence within 26 spaces, to
correspond with their placements in a complete alphabet. This can only be done by allowing a
constant odd number of spaces between the letters of one of the half-sequences. Distributions
are therefore made upon the basis of 3, 5, 7, 9, . . . spaces. Select that distribution which
most nearly coincides with the distribution to be expected in a key-word component. Thus, for
example, with the first half-sequence the distribution selected is the one made by leaving three
spaces between the letters. It is as follows:

  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
  A  -  L  -  C  -  F  -  H  -  K  -  P  -  V  -  X  -  Q  -  E  -  T  -  O  -

j. Now interpolate, by the same constant interval (three spaces in this case), the letters of the
other half-sequence. Noting that the group F - H appears in the foregoing distribution, it is
apparent that G of the second half-sequence should be inserted between F and H. The letter which
immediately follows G in the second half-sequence, viz, M, is next inserted with three spaces
between it and G, that is, four places to the right of G, and so on, until the interpolation has
been completed. This yields the original primary component, which is as follows:

  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
  A  B  L  Y  C  D  F  G  H  J  K  M  P  R  V  W  X  Z  Q  U  E  S  T  I  O  N
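
The distribution and interpolation of subparagraphs i and j may be sketched as below. The function name is hypothetical. Leaving three spaces between letters is treated as a step of four cells, and the choice of G as the letter to be entered between F and H is taken from the text rather than found by the program.

```python
def distribute(half, step, start, size=26):
    """Place the letters of `half` every `step` cells, cyclically, from `start`."""
    cells = {}
    for k, letter in enumerate(half):
        cells[(start + k * step) % size] = letter
    return cells

first = "ACHPXEOLFKVQT"
second = "BDJRZSNYGMWUI"
step = 4  # three blank spaces between successive letters

cells = distribute(first, step, start=0)
# G belongs between F and H, so the second half-sequence is rotated to
# begin at G and entered one cell to the right of F.
f_cell = next(i for i, c in cells.items() if c == "F")
g = second.index("G")
cells.update(distribute(second[g:] + second[:g], step, start=f_cell + 1))

component = "".join(cells[i] for i in range(26))
print(component)
```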



k. Another method of handling cases such as the foregoing is indicated in subparagraph f.
By extending the principles set forth in that subparagraph, one may reconstruct the following
chain of 13 pairs from the secondary alphabet given in subparagraph g:

 1    2    3    4    5    6    7    8    9   10   11   12   13    1
CD → HJ → PR → XZ → ES → ON → LY → FG → KM → VW → QU → TI → AB → CD . . .

Now find, in the foregoing chain, two pairs likely to be sequent, for example HJ and KM, and count
the interval between them in the chain. It is 7 (counting by pairs). If this decimation interval
is now applied to the chain of pairs, the following is established:

  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
  H  J  K  M  P  R  V  W  X  Z  Q  U  E  S  T  I  O  N  A  B  L  Y  C  D  F  G
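
The method of subparagraph k can be sketched by building the chain of 13 pairs and then decimating it at the interval separating HJ from KM. The function name is hypothetical, and the choice of HJ and KM as probably sequent pairs is the one made in the text.

```python
PLAIN = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
CIPHER = "CDHJOKMPBRVFWYLXTZNAIQUEGS"  # secondary alphabet of subparagraph g

def pair_chain(plain, cipher, first_pair):
    """Replace each pair by the pair standing under it until the first recurs."""
    enc = dict(zip(plain, cipher))
    pairs, pair = [], first_pair
    while pair not in pairs:
        pairs.append(pair)
        pair = enc[pair[0]] + enc[pair[1]]
    return pairs

pairs = pair_chain(PLAIN, CIPHER, "CD")
n = len(pairs)
interval = (pairs.index("KM") - pairs.index("HJ")) % n
start = pairs.index("HJ")
component = "".join(pairs[(start + k * interval) % n] for k in range(n))
print(interval, component)
```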



l. The reason why a complete chain of 26 letters cannot be constructed from the secondary
alphabet given under subparagraph g is that it represents a case in which two primary
components of 26 letters were slid an even number of intervals apart. (This will be explained in
further detail in subparagraph r below.) There are in all 12 such cases, none of which will
admit of the construction of a complete chain of 26 letters. In addition, there is one case
wherein, despite the fact that the primary components are an odd number of intervals apart, the
secondary alphabet cannot be made to yield a complete chain of 26 letters for an equivalent
primary component. This is the case in which the displacement is 13 intervals. Note the
secondary alphabet based upon the primary components below (which are the same as those
shown in subparagraph d):

PRIMARY COMPONENTS

QUESTIONABLYCDFGHJKMPRVWXZ

DFGHJKMPRVWXZQUESTIONABLYC

SECONDARY ALPHABET

Plain    ABCDEFGHIJKLMNOPQRSTUVWXYZ

Cipher   RVZQGUESKTIWOPMNDAHJFBLYXC

m. If an attempt is made to construct a chain of letters from this secondary alphabet alone,
no progress can be made because the alphabet is completely reciprocal. However, the
cryptanalyst need not at all be baffled by this case. The attack will follow along the lines shown
below in subparagraphs n and o.

n. If the original primary component is a key-word mixed sequence, the cryptanalyst may
reconstruct it by attempting to "dovetail" the 13 reciprocal pairs (AR, BV, CZ, DQ, EG, FU, HS,
IK, JT, LW, MO, NP, and XY) into one sequence. The members of these pairs are all 13 intervals
apart. Thus:

  0  1  2  3  4  5  6  7  8  9 10 11 12 13
  A  .  .  .  .  .  .  .  .  .  .  .  .  R

FIGURE 22.

Write out the series of numbers from 1 to 26 and insert as many pairs into position as possible,
being guided by considerations of probable partial sequences in the key-word mixed sequence.
Thus:

  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16
  A  B  C  D  .  .  .  .  .  .  .  .  .  R  V  Z  Q

It begins to look as though the key word commences with the letter Q, in which case it should
be followed by U. This means that the next pair to be inserted is FU. Thus:

  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17
  A  B  C  D  F  .  .  .  .  .  .  .  .  R  V  Z  Q  U

The sequence A B C D F means that E is in the key. Perhaps the sequence is A B C D F G H.
Upon trial, using the pairs EG and HS, the following placements are obtained:

  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19
  A  B  C  D  F  G  H  .  .  .  .  .  .  R  V  Z  Q  U  E  S

This suggests the word QUEST or QUESTION. The pair JT is added:

  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
  A  B  C  D  F  G  H  J  .  .  .  .  .  R  V  Z  Q  U  E  S  T

The sequence G H J suggests G H J K, which places an I after T. Enough of the process has
been shown to make the steps clear.

o. Another method of circumventing the difficulties introduced by the 14th secondary
alphabet (displacement interval, 13) is to use it in conjunction with another secondary alphabet
which is produced by an even-interval displacement. For example, suppose the following two
secondary alphabets are available:¹

0.  ABCDEFGHIJKLMNOPQRSTUVWXYZ
    --------------------------
1.  RVZQGUESKTIWOPMNDAHJFBLYXC
2.  XZESKTIORNAQBWVLHYMPJCDFUG

FIGURE 23.

The first of these secondaries is the 13-interval secondary; the second is one of the
even-interval secondaries, from which only half-chain sequences can be constructed. But if the
construction be based upon the two sequences, 1 and 2 in the foregoing diagram, the following is
obtained:

RXUTNLDHMVZEIAYFJPWQSOBCGK

This is a complete equivalent primary component. The original key-word mixed component
can be recovered from it by decimation based upon the 9th interval:

RVWXZQUESTIONABLYCDFGHJKMP
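
The construction of subparagraph o, in which the chain is built from line 1 to line 2 of the diagram instead of from the zero line, can be sketched as follows. The function names are assumptions for this illustration.

```python
ROW1 = "RVZQGUESKTIWOPMNDAHJFBLYXC"  # the 13-interval secondary
ROW2 = "XZESKTIORNAQBWVLHYMPJCDFUG"  # an even-interval secondary

def chain_between(upper, lower, start):
    """Chain letters of `upper` to the letters standing under them in `lower`."""
    step = dict(zip(upper, lower))
    out = start
    while step[out[-1]] != start:
        out += step[out[-1]]
    return out

def decimate(sequence, interval):
    """Take every `interval`-th letter, cyclically, from the first letter."""
    n = len(sequence)
    return "".join(sequence[(k * interval) % n] for k in range(n))

equivalent = chain_between(ROW1, ROW2, "R")
print(equivalent)
print(decimate(equivalent, 9))
```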

p. (1) When the primary components are identical mixed sequences proceeding in opposite
directions, all the secondary alphabets will be reciprocal alphabets. Reconstruction of the
primary component can be accomplished by the procedure indicated under subparagraph o
above. Note the following three reciprocal secondary alphabets:

      1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
0.    A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X  Y  Z
    ------------------------------------------------------------------------------
1.    P  M  H  G  Q  F  D  C  W  Y  L  K  B  R  V  A  E  N  Z  X  U  O  I  T  J  S
2.    W  V  M  K  S  J  H  G  Q  F  D  R  C  X  Z  Y  I  L  E  U  T  B  A  N  P  O
3.    T  S  Q  Z  L  X  W  V  N  R  P  E  M  I  O  K  C  J  B  A  Y  H  G  F  U  D

FIGURE 24.

(2) Using lines 1 and 2, the following chain can be constructed (equivalent primary
component):

PWQSOBCGKRXUTNLDHMVZEIAYFJ

¹ The method of writing down the secondaries shown in figure 23 will hereafter be followed in all cases when
alphabet reconstruction skeletons are necessary. The top line will be understood to be the plain component; it
is common to all the secondary alphabets, and is set off from the cipher components by the heavy black line.
This top line of letters will be designated by the digit 0, and will be referred to as "the zero line" in the diagram.
The successive lines of letters, which occupy the space below the zero line and which contain the various cipher
components of the several secondary alphabets, will be numbered serially. These numbers may then be used as
reference numbers for designating the horizontal lines in the diagram. The numbers standing above the letters
may be used as reference numbers for the vertical columns in the diagram. Hence, any letter in the
reconstruction skeleton may be designated by coordinates, giving the horizontal or X coordinate first. Thus, D (2-11)
means the letter D standing in line 2, column 11.
Or, using lines 2 and 3:

WTYKZODPUAGVSLJXICMQNFREBH

The original key-word mixed primary component (based on the word QUESTIONABLY) can
be recovered from either of the two foregoing equivalent primary components. But if lines 1
and 3 are used, only half-chains can be constructed:

PTFXAKECVOHQL and MSDWNJUYRIGZB

This is because 1 and 3 are both odd-interval secondary alphabets, whereas 2 is an
even-interval secondary. It may be added that odd-interval secondaries are characterized by having
two cases in which a plain-text letter is enciphered by itself; that is, Θp is identical with Θc.
This phrase "identical with" will be represented by the symbol ≡; the phrase "not identical
with" will be represented by the symbol ≢. (Note that in secondary alphabet number 1 above,
Fp≡Fc and Up≡Uc; in secondary alphabet number 3 above, Mp≡Mc and Op≡Oc.) This
characteristic will enable the cryptanalyst to select at once the proper two secondaries to work
with in case several are available; one should show two cases where Θp≡Θc; the other should show
none.
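
The behavior of components proceeding in opposite directions can be reproduced with a short sketch. The function names are hypothetical, and the three settings used (2, 5, and 12) are not stated in the text; they were inferred for this illustration from the rows of Figure 24.

```python
import string

PRIMARY = "QUESTIONABLYCDFGHJKMPRVWXZ"

def reversed_secondary(component, k):
    """Cipher row when the cipher copy runs in the opposite direction:
    the plain letter at position i meets the letter at position k - i."""
    n = len(component)
    return "".join(component[(k - component.index(p)) % n]
                   for p in string.ascii_uppercase)

def self_enciphered(row):
    """Plain letters that are enciphered by themselves."""
    return [p for p, c in zip(string.ascii_uppercase, row) if p == c]

def chain_between(upper, lower, start):
    step = dict(zip(upper, lower))
    out = start
    while step[out[-1]] != start:
        out += step[out[-1]]
    return out

# Settings 2, 5, 12 are assumptions inferred from Figure 24.
rows = {1: reversed_secondary(PRIMARY, 2),
        2: reversed_secondary(PRIMARY, 5),
        3: reversed_secondary(PRIMARY, 12)}

for number, row in rows.items():
    print(number, row, self_enciphered(row))
print(chain_between(rows[1], rows[2], "P"))
print(chain_between(rows[1], rows[3], "P"))
```

Pairing a secondary that shows two self-enciphered letters with one that shows none gives the full chain of 26 letters, whereas pairing two of the same kind gives only a half-chain.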

q. (1) When the primary components are different mixed sequences, their reconstruction
from secondary cipher alphabets follows along the same lines as set forth above, under b to j,
inclusive, with the exception that the selection of letters for building up the chain of equivalents
for the primary cipher component is restricted to those below the zero line in the reconstruction
skeleton. Having reconstructed the primary cipher component, the plain component can be
readily reconstructed. This will become clear if the student will study the following example.

0.  ABCDEFGHIJKLMNOPQRSTUVWXYZ
    --------------------------
1.  TVABULIQXYCWSNDPFEZGRHJKMO
2.  ZJSTVIQRMONKXEAGBWPLHYCDFU

FIGURE 25.

(2) Using only lines 1 and 2, the following chain is constructed:

TZPGLIQRHYOUVJCNEWKDASXMFB

This is an equivalent primary cipher component. By finding the values of the successive
letters of this chain in terms of the plain component of secondary alphabet number 1 (the zero
line), the following is obtained:

TZPGLIQRHYOUVJCNEWKDASXMFB

ASPTFGHUVJZEBWKNRLXOCMIYQD

The sequence A S P T . . . is an equivalent primary plain component. The original
key-word mixed components may be recovered from each of the equivalent primary components.
That for the primary plain component is based upon the key PUBLISHERS MAGAZINE; that for
the primary cipher component is based upon the key QUESTIONABLY.
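
The steps of subparagraph q (2) can be sketched as follows. The function names are hypothetical, and the decimation interval of 5 used at the end is not given in the text; it was worked out for this illustration from the two chains themselves.

```python
ZERO = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
ROW1 = "TVABULIQXYCWSNDPFEZGRHJKMO"
ROW2 = "ZJSTVIQRMONKXEAGBWPLHYCDFU"

def chain_between(upper, lower, start):
    step = dict(zip(upper, lower))
    out = start
    while step[out[-1]] != start:
        out += step[out[-1]]
    return out

def decimate(sequence, interval):
    n = len(sequence)
    return "".join(sequence[(k * interval) % n] for k in range(n))

cipher_equiv = chain_between(ROW1, ROW2, "T")
to_plain = dict(zip(ROW1, ZERO))            # decipher through alphabet 1
plain_equiv = "".join(to_plain[c] for c in cipher_equiv)

print(cipher_equiv)
print(plain_equiv)
# Interval 5 is an assumption found by trial for this sketch.
print(decimate(cipher_equiv, 5))
print(decimate(plain_equiv, 5))
```

The last two lines show the key-word mixed sequences in cyclic positions that do not begin with the key word; the words QUES(TIONABLY) and PUBLISHERM(AGZN) can be read by carrying the end of each line round to its beginning.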
(3) Another method of accomplishing the process indicated above can be illustrated
graphically by the following two chains, based upon the two secondary alphabets set forth in
subparagraph q (1):

      1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
0.    A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X  Y  Z
    ------------------------------------------------------------------------------
1.    T  V  A  B  U  L  I  Q  X  Y  C  W  S  N  D  P  F  E  Z  G  R  H  J  K  M  O
2.    Z  J  S  T  V  I  Q  R  M  O  N  K  X  E  A  G  B  W  P  L  H  Y  C  D  F  U

Col. 1           Col. 2
A (0-1)    →    T (1-1);     T (2-4)    →    D (0-4);
D (0-4)    →    B (1-4);     B (2-17)   →    Q (0-17);
Q (0-17)   →    F (1-17);    F (2-25)   →    Y (0-25);
Y (0-25)   →    M (1-25);    M (2-9)    →    I (0-9);
I (0-9)    →    X (1-9);     X (2-13)   →    M (0-13);
M (0-13)   →    S (1-13);    S (2-3)    →    C (0-3);
etc.             etc.

FIGURE 26.

(4) By joining the letters in Column 1, the following chain is obtained: A D Q Y I M, etc.
If this be examined, it will be found to be an equivalent primary of the sequence based upon
PUBLISHERS MAGAZINE. By joining the letters in Column 2, the following chain is obtained:
T B F M X S. This is an equivalent primary of the sequence based upon QUESTIONABLY.

r. A final word concerning the reconstruction of primary components in general may be
added. It has been seen that in the case of a 26-element component sliding against itself (both
components proceeding in the same direction), it is only the secondary alphabets resulting from
odd-interval displacements of the primary components which permit of reconstructing a single
26-letter chain of equivalents. This is true except for the 13th interval displacement, which,
even though an odd number, still acts like an even-number displacement in that no complete
chain of equivalents can be established from the secondary alphabet. This exception gives the
clue to the basic reason for this phenomenon: it is that the number 26 has two factors, 2 and 13,
which enter into the picture. With the exception of displacement-interval 1, any displacement
interval which is a sub-multiple of, or has a factor in common with, the number of letters in the
primary sequence will yield a secondary alphabet from which no complete chain of 26 equivalents
can be derived for the construction of a complete equivalent primary component. This general rule
is applicable only to components which progress in the same direction; if they progress in opposite
directions, all the secondary alphabets are reciprocal alphabets and they behave exactly like
the reciprocal secondaries resulting from the 13-interval displacement of two 26-letter identical
components progressing in the same direction.

s. The foregoing remarks give rise to the following observations based upon the general
rule pointed out above. Whether or not a complete equivalent primary component is derivable
by decimation from an original primary component (and if not, the lengths and numbers of chains
of letters, or incomplete components, that can be constructed in attempts to derive such
equivalent components) will depend upon the number of letters in the original primary component
and the specific decimation interval selected. For example, in a 26-letter original primary
component, decimation interval 5 will yield a complete equivalent primary component of 26 letters,
whereas decimation intervals 4 or 8 will yield 2 chains of 13 letters each. In a 24-letter
component, decimation interval 5 will also yield a complete equivalent primary component (of 24
letters), but decimation interval 4 will yield 4 chains of 6 letters each, and decimation interval
8 will yield 8 chains of 3 letters each. It also follows that in the case of an original primary
component in which the total number of characters is a prime number, all decimation intervals will
yield complete equivalent primary components. The following table has been drawn up in the
light of these observations, for original primary sequences from 16 to 32 elements. (All
prime-number sequences have been omitted.) In this table, the column at the extreme left gives the
various decimation intervals, omitting in each case the first interval, which merely gives the
original primary sequence, and the last interval, which merely gives the original sequence
reversed. The top line of the table gives the various lengths of original primary sequences from
32 down to 16. (The student should bear in mind that sequences containing characters in
addition to the letters of the alphabet may be encountered; he can add to this table when he is
interested in sequences of more than 32 characters.) The numbers within the table then show,
for each combination of decimation interval and length of original sequence, the lengths of the
chains of characters that can be constructed. (The student may note the symmetry in each
column.) The bottom line shows the total number of complete equivalent primary components
which can be derived for each different length of original component.
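
The rule of subparagraphs r and s reduces to a statement about common factors, and the entries of such a table can be computed rather than counted. The sketch below is illustrative; the function names are hypothetical. A decimation interval d applied to a sequence of n letters gives g chains of n/g letters each, where g is the greatest common divisor of n and d.

```python
from math import gcd

def chains(length, interval):
    """Return (number of chains, letters per chain) for one decimation."""
    g = gcd(length, interval)
    return g, length // g

def complete_components(length):
    """Intervals from 2 to length - 2 that give one unbroken chain, the
    first and last intervals being omitted as in the table."""
    return [d for d in range(2, length - 1) if gcd(length, d) == 1]

print(chains(26, 5), chains(26, 4), chains(26, 13))
print(chains(24, 5), chains(24, 4), chains(24, 8))
print(complete_components(26))
```

For 26 letters and interval 13 the result is 13 chains of 2 letters each, which are the 13 reciprocal pairs of the 13-interval secondary alphabet.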
Section VIII

APPLICATION OF PRINCIPLES OF INDIRECT SYMMETRY OF POSITION

Paragraph

Applying the principles to a specific example 32

The cryptogram employed in the exposition 33

Fundamental theory 34

Application of principles 35

General remarks 36


32. Applying the principles to a specific example. — a. The preceding section, with the 
many details covered, now forms a sufficient base for proceeding with an exposition of how the 
principles of indirect symmetry of position can be applied very early in the solution of a poly- 
alphabetic substitution cipher in which sliding primary components were employed to produce 
the secondary cipher alphabets for the enciphering of the cryptogram. 

b. The case described below will serve not only to explain the method of applying these 
principles but will at the same time show how their application greatly facilitates the solution 
of a single, rather difficult, polyalphabetic substitution cipher. It is realized, of course, that the
cryptogram could be solved by the usual methods of frequency and long, patient experimentation. 
However, the method to be described was actually applied and very materially reduced the 
amount of time and labor that would otherwise have been required for solution. 

33. The cryptogram employed in the exposition. — a. The problem that will be used in this 
exposition involves an actual cryptogram submitted for solution in connection with a cipher 
device having two concentric disks upon which the same random mixed alphabet appears, both 
alphabets progressing in the same direction. This was obtained from a study of the descriptive 
circular accompanying the cryptogram. By the usual process of factoring, it was determined 
that the cryptogram involved 10 alphabets. The message as arranged according to its period 
is shown in Figure 27, in which all repetitions of two or more letters are indicated. 

b. The triliteral frequency distributions are given in Figure 28. It will be seen that on
account of the brevity of the message, considering the number of alphabets involved, the fre- 
quency distributions do not yield many clues. By a very careful study of the repetitions, 
tentative individual determinations of values of cipher letters, as illustrated in Figures 29, 30, 
31, and 32, were made. These are given in sequence and in detail in order to show that there is 
nothing artificial or arbitrary in the preliminary stages of analysis here set forth. 

THE CRYPTOGRAM

      1  2  3  4  5  6  7  8  9 10
A     W  F  U  P  C  F  O  C  J  Y
B     G  B  Z  D  P  F  B  O  U  O
C     G  R  F  T  Z  M  Q  M  A  V
D     K  Z  U  G  D  Y  F  T  R  W
E     G  J  X  N  L  W  Y  O  U  X
F     I  O  E  P  S  I  Z  O  K  Z
G     P  R  X  D  W  L  Z  I  C  W
H     G  K  Q  H  O  L  O  D  V  M
I     G  Q  X  S  N  Z  H  A  S  E
J     B  B  J  I  P  Q  F  J  H  D
K     O  C  B  Z  E  X  Q  T  X  Z
L     J  C  Q  R  Q  F  V  M  L  H
M     S  R  Q  E  W  M  L  N  A  E
P     R  C  V  O  P  N  B  L  C  W
Q     L  Q  Z  A  A  A  C  H
R     B  Z  Z  C  K  Q  O  I  K  F
S     C  F  B  S  C  V  X  C  H  Q
T     Z  T  Z  S  D  M  X  W  C  M
U     R  K  U  H  E  Q  E  D  G  X
V     F  K  V  H  P  J  J  K  J  Y
W     Y  Q  D  P  C  J  X  L  L  L
X     G  H  X  E  R  O  Q  P  S  E
Y     G  K  B  W  T  L  F  D  U  Z
Z     O  C  D  H  J  L  M  Z  T  U  Z
AA    K  L  B  P  C  J  O  T  X  E
BB    H  S  P  O  P  N  M  D  L  M
EE    B  K  D  Z  F  M  T  G  Q  J
FF    L  F  I  U  Y  D  T  Z  V  H  Q
GG    Z  G  W  N  K  X  J  T  R  J  L
HH    Y  T  X  C  D  P  M  V  L  W
II    B  G  B  W  W  O  Q  R  G  N
JJ    H  H  V  L  A  Q  Q  V  A  V
KK    J  Q  W  O  O  T  T  N  V  Q
LL    B  K  X  D  S  O  Z  R  S  I  L
MM    Y  U  X  O  J  P  P  Y  O  X  Z
NN    H  O  Z  O  W  M  X  C  G  Q
OO    J  J  U  G  D  W  O  R  V  M
PP    U  K  J  P  E  F  X  E  N  F
QQ    C  C  U  G  D  W  P  E  U  H

FIGURE 27.
INITIAL VALUES FROM ASSUMPTIONS

Gc=Ep (alphabet 1); Kc=Ep (alphabet 2); Xc=Ep (alphabet 3); and Dc=Ep (alphabet 5), from
frequency considerations.

UGD=THE (alphabets 3, 4, 5); PCJ=THE (alphabets 4, 5, 6); and SEG=THE (alphabets 9, 10, 1),
from study of repetitions.

FIGURE 29.

ADDITIONAL VALUES FROM ASSUMPTIONS (I)

Refer to line DD in Figure 29; Sc (alphabet 3) assumed to be Np.

Refer to line M in Figure 29; Ac (alphabet 9) assumed to be Wp.

Then in lines C-D, A V K Z U G D (alphabets 9, 10, 1, 2, 3, 4, 5) is assumed to be WITH THE.

FIGURE 30.

ADDITIONAL VALUES FROM ASSUMPTIONS (II)

Refer to Figure 30, line A; W F U P C F O C J Y (alphabets 1 to 10), in which the values T T H
already stand under U P C; assume to be BUT THOUGH.

Refer to Figure 30, lines N and X, where the repetition X E R O (alphabets 3, 4, 5, 6) occurs;
assume EACH.

FIGURE 31.

ADDITIONAL VALUES FROM ASSUMPTIONS (III)

OPN (alphabets 4, 5, 6): assume ING from repetition and frequency.

HQZ (alphabets 9, 10, 1): assume ING from repetition and frequency.

FIGURE 32.
c. From the initial and subsequent tentative identifications shown in Figures 29, 30, 31,
and 32, the values obtained were arranged in the form of the secondary alphabets in a
reconstruction skeleton, shown in Figure 33.

FIGURE 33. (Reconstruction skeleton.)

34. Fundamental theory. — a. In paragraph 31, methods of reconstructing primary
components from secondary alphabets were given in detail. It is necessary that those methods be
fully understood before the following steps be studied. It was there shown that the primary
component can be one of a series of equivalent primary sequences, all of which will give exactly
similar results so far as the secondary alphabets and the cryptographic text are concerned.
It is not necessary that the identical or original primary component employed in the
cryptographing be reconstructed; any equivalent primary sequence will serve. The whole question is
one of establishing a sequence of letters the interval between which is either identical with that
in the original primary component or else is an exact constant multiple of the interval separating
the letters in the original primary component. For example, suppose K P X N Q forms a
sequence in the original primary component. Here the interval between K and P, P and X,
X and N, and N and Q is one; in an equivalent primary component, say the sequence
K . . P . . X . . N . . Q, the interval between K and P is three, that between P and X also three,
and so on; and the two sequences will yield the same secondary alphabets. So long as the interval
between K and P, P and X, X and N, N and Q, . . . , is a constant one, the sequence will be
cryptographically equivalent to the original primary sequence and will yield the same secondary
alphabets as do those of the original primary sequence. However, in the case of a 26-letter
component, it is necessary that this interval be an odd number other than 13, as these are the
only cases which will yield one unbroken sequence of 26 letters. Suppose a secondary alphabet to
be as follows:

(1)  Plain    A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
     Cipher   . . . . . . . . . . . . . X . K N . . . . . . P . .

It can be said that the primary component contains the following sequences:

XN KP NQ PX

These, when united by means of their common letters, yield K P X N Q.

Suppose also the following secondary alphabet is at hand:

(2)  Plain    A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
     Cipher   . . . . . . . . . . . . . P . . X . . . . . . K . N

Here the sequences PN, XQ, KX, and NZ can be obtained, which when united yield the two
sequences KXQ and PNZ.

By a comparison of the sequences K P X N Q, K X Q, and P N Z, one can establish the
following:

K P X N Q
K . X . Q
  P . N . Z

It follows that one can now add the letter Z to the sequence, making it K P X N Q Z.

b. The reconstruction of a primary component from one of the secondary alphabets by the
process given in paragraph 31 requires a complete or nearly complete secondary alphabet.
This is at hand only after a cryptogram has been completely solved. But if one could employ
several very scant or skeletonized secondary alphabets simultaneously with the analysis of the
cryptogram, one could then possibly build up a primary component from fewer data and thus
solve the cryptogram much more rapidly than would otherwise be possible.

c. Suppose only the cipher components of the two secondary alphabets (1) and (2) given
above be placed into juxtaposition. Thus:

      1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
(1)   .  .  .  .  .  .  .  .  .  .  .  .  .  X  .  K  N  .  .  .  .  .  .  P  .  .
(2)   .  .  .  .  .  .  .  .  .  .  .  .  .  P  .  .  X  .  .  .  .  .  .  K  .  N

The sequences PX, XN, and KP are given by juxtaposition. These, when united, yield KPXN
as part of the primary sequence. It follows, therefore, that one can employ the cipher components
of secondary alphabets as sources of independent data to assist in building up the primary
sequences. The usefulness of this point will become clearer subsequently.
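
The juxtaposition described in subparagraph c can be sketched with two skeleton rows in which a period marks an unknown value. The function names are hypothetical, and the uniting step assumes that the partial sequences do not close upon themselves, which is the case with scanty data of this kind.

```python
ROW1 = "." * 13 + "X.KN" + "." * 6 + "P.."   # cipher component of alphabet (1)
ROW2 = "." * 13 + "P..X" + "." * 6 + "K.N"   # cipher component of alphabet (2)

def juxtaposed_pairs(upper, lower):
    """Pairs read from line (2) to line (1) in columns where both are filled."""
    return [(b, a) for a, b in zip(upper, lower) if a != "." and b != "."]

def unite(pairs):
    """Join the pairs end to end through their common letters."""
    succ = dict(pairs)
    heads = [a for a in succ if a not in succ.values()]
    sequences = []
    for head in heads:
        seq = head
        while seq[-1] in succ:
            seq += succ[seq[-1]]
        sequences.append(seq)
    return sequences

pairs = juxtaposed_pairs(ROW1, ROW2)
print(pairs)
print(unite(pairs))
```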

35. Application of principles. — a. Refer now to the reconstruction skeleton shown in 
Figure 33. Hereafter, in order to avoid all ambiguity and for ease in reference, the position of
a letter in Figure 33 will be indicated as stated in footnote 1, page 56. Thus, N (6-7) refers to
the letter N in line 6 and in column 7 of Figure 33.

b. (1) Now, consider the following pairs of letters:

E (0-5)      J (6-5)
G (0-7)      N (6-7)
H (0-8)      O (6-8)   }
O (0-15)     F (6-15)  }   HO, OF = HOF

(One is able to use the line marked zero in Figure 33 since this is a mixed sequence sliding against
itself.)
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(2) The immediate results of this set of values will now be given. Having HOF as a sequence, 
with EJ as belonging to the same displacement interval, suppose HOF and EJ are placed into 
juxtaposition as portions of sliding components. Thus: 

Plain HOF. . . 

Cipher. E J . . . . 

When Hp=Ec, then 0p= J*. 

(3) Refer now to alphabet 10, Figure 33, where it is seen that Hp=Ee. Ths derived vodue, 
0p= Jo, can immediately be inserted in the same alphabet and substituted in the cryptogram. 

(4) The student may possibly get a clearer idea of the principles involved if he will regard
the matter as though he were dealing with arithmetical proportion. For instance, given any
three terms in the proportion 2:8=4:16, the 4th term can easily be found. Furthermore, given
the pair of values on the left-hand side of the equation, one may find numerous pairs of
values which may be inserted in the right-hand side, or vice versa. For instance, 2:8=4:16
is the same as 2:8=5:20, or 9:36=4:16, and so on. An illustration of each of these principles
will now be given, reference being made to Figure 33. As an example of the first principle, note
that E (0-5):H (0-8)=J (6-5):O (6-8). Now find E (10-8):H (0-8)=? (10-15):O (0-15).
It is clear that J may be inserted as the 3d term in this proportion, thus giving the
important new value, Op=Jc in alphabet 10, which is exactly what was obtained directly above,
by means of the partial sliding components. As an example of the second principle, note the
following pairs:

E (0-5)     H (0-8)
K (2-5)     Z (2-8)
D (5-5)     C (5-8)
J (6-5)     O (6-8)

These additional pairs are also noted:

K (1-20)    Z (1-7)
T (0-20)    G (0-7)

Therefore, E:H=K:Z=D:C=J:O=T:G, and T may be inserted in position (4-5).

c. (1) Again, GN belongs to the same set of displacement-interval values as do EJ and HOF.
Hence, by superimposition:

Plain    . . . H O F . . .

Cipher   . . . G N . . . .

(2) Referring to alphabet 4, when Hp=Gc, then Op=Nc. Therefore, the letter N can be inserted
in position (4-15) in Figure 33, and the value Nc=Op can be substituted in the cryptogram.

(3) Furthermore, note the corroboration found from this particular superimposition:

H (0-8)     G (0-7)
O (6-8)     N (6-7)

This checks up the value in alphabet 6, Gp=Nc.

d. (1) Again superimpose HOF and GN:

. . . H O F . . .
. . . . G N . . .

(2) Note this corroboration:

O (6-8)     G (4-8)
F (6-15)    N (4-15)

which has just been inserted in Figure 33, as stated above.

e. (1) Again using HOF and EJ, but in a different superimposition:

. . . H O F . .
. . E J . . . .

(2) Refer now to H (9-9), J (9-8). Directly under these letters is found V (10-9), E (10-8).
Therefore, the V can be added immediately before HOF, making the sequence V H O F.

f. (1) Now take V H O F and juxtapose it with E J, thus:

. . . V H O F . . .
. . . E J . . . . .

(2) Refer now to Figure 33, and find the following:

V (10-9)    E (10-8)
H (9-9)     J (9-8)
O (4-9)     G (4-8)
I (0-9)     H (0-8)

(3) From the value O G it follows that G can be set next to J in E J. Thus:

. . . V H O F . . .
. . . E J G . . . .

(4) But G N already is known to belong to the same set of displacement-interval values
as E J. Therefore, it is now possible to combine E J, J G, and G N into one sequence, E J G N,
yielding:

. . . V H O F . . .
. . . E J G N . . .

g. (1) Refer now to Figure 33.

V (0-22)    E (0-5)
? (1-22)    G (1-5)
? (2-22)    K (2-5)
? (3-22)    X (3-5)
? (5-22)    D (5-5)
? (6-22)    J (6-5)

(2) The only values which can be inserted are:

O (1-22)    G (1-5)
H (6-22)    J (6-5)

(3) This means that Vp=Oc in alphabet 1 and that Vp=Hc in alphabet 6. There is one Oc
in the frequency distribution for alphabet 1, and no Hc in that for alphabet 6. The frequency
distribution is, therefore, corroborative insofar as these values are concerned.

h. (1) Further, taking E J G N and V H O F, superimpose them thus:

. . . . E J G N . . .
. . . V H O F . . . .

(2) Refer now to Figure 33.

E (0-5)     H (0-8)
G (1-5)     ? (1-8)

(3) From the diagram of superimposition the value G (1-5) F (1-8) can be inserted, which
gives Hp=Fc in alphabet 1.

i. (1) Again, V H O F and E J G N are juxtaposed:

. . . V H O F . . .
. . E J G N . . . .

(2) Refer to Figure 33 and find the following:

H (0-8)     G (4-8)
A (0-1)     E (4-1)

This means that it is possible to add A, thus:

. . A V H O F . . .
. . E J G N . . . .

(3) In the set there are also:

E (0-5)     G (1-5)
G (0-7)     Z (1-7)

Then in the superimposition

. . . E J G N . . .
. E J G N . . . . .

it is possible to add Z under G, making the sequence E J G N Z.

(4) Then taking

. . . A V H O F . . .
. . . E J G N Z . . .

and referring to Figure 33:

H (0-8)     N (0-14)
O (6-8)     ? (6-14)

It will be seen that ?=Z from superimposition, and hence in alphabet 6 Np=Zc, an important
new value, but occurring only once in the cryptogram. Has an error been made? The work
so far seems too corroborative in interlocking details to think so.

j. (1) The possibilities of the superimposition and sliding of the AVHOF and the EJGNZ
sequences have by no means been exhausted as yet, but a little different trail this time may
be advisable.

E (0-5)     T (0-20)
G (1-5)     K (1-20)
X (3-5)     U (3-20)

(2) Then:

. . . E J G N Z . . .
. . . T . K . . . . .

(3) Now refer to the following:

E (0-5)     K (2-5)
N (0-14)    S (2-14)

whereupon the value S can be inserted:

E J G N Z
K . . S .



k. (1) Consider all the values based upon the displacement interval corresponding to JG:

J (6-5)     G (1-5)
N (6-7)     Z (1-7)

J (9-8)     G (4-8)
H (9-9)     O (4-9)
S (9-20)    P (4-20)

S (2-14)    P (5-14)
Z (2-8)     C (5-8)
K (2-5)     D (5-5)

(2) Since J and G are sequent in the E J G N Z sequence, it can be said that all the letters
of the foregoing pairs are also sequent. Hence Z C, S P, and K D are available as new data.
These give E J G N Z C and T . K D . S P.

(3) Now consider:

T (0-20)    P (4-20)
A (0-1)     E (4-1)
H (0-8)     G (4-8)
I (0-9)     O (4-9)

Now in the T . K D . S P sequence the interval between T and P is 6:

  1 2 3 4 5 6
T . K D . S P

Hence the interval between A and E is 6 also. It follows therefore that the sequences A V H O F
and E J G N Z C should be united, thus:

        1 2 3 4 5 6
. . . A V H O F . E J G N Z C . . .

(4) Corroboration is found in the interval between H and G, which is also six. The letter I
can be placed into position, from the relation I (0-9) O (4-9), thus:

        1 2 3 4 5 6
. . . I . . A V H O F . E J G N Z C . . .

l. (1) From Figure 33:

H (0-8)     Z (2-8)
E (0-5)     K (2-5)
N (0-14)    S (2-14)
U (0-21)    F (2-21)

(2) Since in the I . . A V H O F . E J G N Z C sequence the letters H and Z are separated
by 8 intervals one can write:

  1 2 3 4 5 6 7 8
H . . . . . . . Z
E . . . . . . . K
N . . . . . . . S
U . . . . . . . F

(3) Hence one can make the sequence

       . . . I . . A V H O F . E J G N Z C . . K . . .

Then   . . . I . . A V H O F . E J G N Z C T . K D . S P . . .
and    . . U I . . A V H O F . E J G N Z C T . K D . S P . . .

m. (1) Subsequent derivations can be indicated very briefly as follows:

E (0-5)     C (0-3)
D (5-5)     R (5-3)

From

 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
 U  I  .  .  A  V  H  O  F  .  E  J  G  N  Z  C  T  .  K  D  .  S  P  .  .  .

one can write

  1 2 3 4 5
E . . . . C

and

  1 2 3 4 5
D . . . . R

making the sequence

 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
 U  I  .  .  A  V  H  O  F  .  E  J  G  N  Z  C  T  .  K  D  .  S  P  .  R  .



(2) Another derivation:

U (3-20)    T (0-20)
X (3-5)     E (0-5)

From

 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
 U  I  .  .  A  V  H  O  F  .  E  J  G  N  Z  C  T  .  K  D  .  S  P  .  R  .

one can write, counting cyclically from T at position 17 around to U at position 1,

T . . . . . . . . . U

and

E . . . . . . . . . X

making the sequence

 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
 U  I  .  .  A  V  H  O  F  .  E  J  G  N  Z  C  T  .  K  D  X  S  P  .  R  .



(3) Another derivation:

From            E (0-5)     G (1-5)
                B (0-2)     W (1-2)

and             . E J G . . .
one can write   . E . G . . .
and then        . B . W . . .

There is only one place where B . W can fit, viz, at the end:

 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
 U  I  .  .  A  V  H  O  F  .  E  J  G  N  Z  C  T  .  K  D  X  S  P  B  R  W



n. Only four letters remain to be placed into the sequence, viz, L, M, Q, and Y. Their
positions are easily found by application of the primary component to the message. The com-
plete sequence is as follows:

 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
 U  I  M  Y  A  V  H  O  F  L  E  J  G  N  Z  C  T  Q  K  D  X  S  P  B  R  W

Having the primary component fully constructed, decipherment of the cryptogram can be
completed with speed and precision. The text is as follows:



WFUPCFOCJY 

BUTTHOUGHW 

GBZDPFBOUO 

ECANNOTASY 

GRFTZMQMAV 

ETREVIEWWI 

KZUGDYFTRW 

THTHEMINDS 

GJXNLWYOUX 

EYEOURPAST 

ITWEPQZOKZ 

WECANTOANE 

PRXCWLZICW 

XTENTFORES 

GKQHOLODVM 

EEOURFUTUR 

GOXSNZHASE 

EWECANWITH 

BBJIPQFJHD 

SCIENTIFIC 

QCBZEXQTXZ 

CONFIDENCE 

JCQRQFVMLH 

LOOKFORWAR 

SRQEWMLNAE 

DTOATIMEWH 

GSXEROZJSE 

ENEACHOFTH 

GVQWEJMKGH 

EBODIESCOM 



RCVOPNBLCW 

POSINGTHES 

LQZAAAMDCH 

OLARSYSTEM 

BZZCKQOIKF 

SHALLTURNA 

CFBSCVXCHQ 

NUNCHANGIN 

ZTZSDMXWCM 

GFACEINPER 

RKUHEQEDGX 

PETUITYTOT 

FKVHPJJKJY 

HESUNEACHW 

YQDPCJXLLL 

ILLTHENHAV 

GHXEROQPSE 

EREACHEDTH 

GKBWTLFDUZ 

EENDOFITSE 

OCDHWMZTUZ 

VOLUTIONSE 

KLBPCJOTXE 

TINTHEUNCH 

HSPOPNMDLM 

ANGINGSTAR 

GCKWDVBLSE 

EOFDEATHTH 

GSUGDPOTHX 

ENTHESUNIT 

Figure 34.



BKDZFMTGQJ 

SELFWILLGO 

LFUYDTZVHQ 

OUTBECOMIN 

ZGWNKXJTRN 

GACOLDANDL 

YTXCDPMVLW 

IFELESSMAS 

BGBWWOQRGN 

SANDTHESOL 

HHVLAQQVAV 

ARSYSTEMWI 

JQWOOTTNVQ 

LLCIRCLEUN 

BKXDSOZRSN 

SEENGHOSTL 

YUXOPPYOXZ 

IKEINSPACE 

HOZOWMXCGQ

AWAITINGON 

JJUGJWQRVM 

LYTHERESUR 

UKWPEFXENF 

RECTIONOFA 

CCUGDWPEUH 

NOTHERCOSM 

YBWEWVMDYJ 

ICCATASTRO 

R Z X 
P H E 



o. The primary component appears to be a random-mixed sequence; no key word is to be
found, at least none reappears on experimentation with various hypotheses as to enciphering
equations. Nevertheless, the random construction of the primary component did not compli-
cate or retard the solution.

p. Some students may prefer to work exclusively with the reconstruction skeleton, rather
than with sliding strips. One method is as good as the other and personal preferences will dictate
which will be used by the individual student. If the reconstruction skeleton is used, the original
letters should be inserted in ink, so as to differentiate them from derived letters.
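
The interval relations on which paragraph 35 rests can be checked mechanically against the completed primary component. The following is a minimal sketch in Python, not part of the original method; the function name `interval` and the two pair lists are illustrative only, the pairs themselves being those quoted in subparagraph b. Every pair read from the same two lines, or the same two columns, of Figure 33 should stand the same number of places apart on the component.

```python
# Completed primary component from subparagraph n.
PRIMARY = "UIMYAVHOFLEJGNZCTQKDXSPBRW"


def interval(a, b, seq=PRIMARY):
    """Cyclic distance from letter a forward to letter b on the component."""
    return (seq.index(b) - seq.index(a)) % len(seq)


# Pairs read from lines 0 and 6 of Figure 33 (subparagraph b (1)).
pairs_lines_0_6 = [("E", "J"), ("G", "N"), ("H", "O"), ("O", "F")]

# Pairs read from columns 5 and 8, and from columns 20 and 7 (subparagraph b (4)).
pairs_columns = [("E", "H"), ("K", "Z"), ("D", "C"), ("J", "O"), ("T", "G")]

intervals_lines = {interval(a, b) for a, b in pairs_lines_0_6}
intervals_columns = {interval(a, b) for a, b in pairs_columns}

# Each set should hold a single value if the pairs share one displacement interval.
print("lines 0 and 6:", intervals_lines)
print("columns:", intervals_columns)
```

A set holding more than one value would point to a wrongly assumed equivalent among the pairs, which is the corroborative use of the method described in paragraph 36.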

36. General remarks. — a. It is to be stated that the sequence of steps described in the 
preceding paragraphs corresponds quite closely with that actually followed in solving the prob- 
lem. It is also to be pointed out that this method can be used as a control in the early stages 
of analysis because it will allow the cryptanalyst to check assumptions for values. For example, 
the very first value derived in applying the principles of indirect symmetry to the problem 
herein described was Hc=Ap in alphabet 1. As a matter of fact the writer had been inclined
toward this value, from a study of the frequency and combinations which Hc showed; when the
indirect-symmetry method actually substantiated his tentative hypothesis he immediately
proceeded to substitute the value given. If he had assigned a different value to Hc, or if he had
assumed a letter other than Hc for Ap in that alphabet, the conclusion would immediately follow
that either the assumed value for Hc was erroneous, or that one of the values which led to the
derivation of Hc=Ap by indirect symmetry was wrong. Thus, these principles aid not only in
the systematic and nearly automatic derivation of new values (with only occasional, or incidental 
references to the actual frequencies of letters), but they also assist very materially in serving as 
corroborative checks upon the validity of the assumptions already made. 

b. Furthermore, while the writer has set forth, in the reconstruction skeleton in Figure 33, 
a set of 30 values apparently obtained before he began to reconstruct the primary component, 
this was done for purposes of clarity and brevity in exposition of the principles herein described. 
As a matter of fact, what he did was to watch very carefully, when inserting values in the recon- 
struction skeleton to find the very first chance to employ the principles of indirect symmetry; 
and just as soon as a value could be derived, he substituted the value in the cryptographic text. 
This is good procedure for two reasons. Not only will it disclose impossible combinations but 
also it gives opportunity for making further assumptions for values by the addition of the derived 
values to those previously assumed. Thus, the processes of reconstructing the primary com-
ponent and finding additional data for the reconstruction proceed simultaneously in an ever-
widening circle.

c. It is worth noting that the careful analysis of only 30 cipher equivalents in the recon-
struction skeleton shown in Figure 33 results in the derivation of the entire table of secondary
alphabets, 676 values in all. And while the elucidation of the method seems long and tedious, in 
its actual application the results are speedy, accurate, and gratifying in their corroborative effect 
upon the mental activity of the cryptanalyst. 

d. (1) The problem here used as an illustrative case is by no means one that most favorably
presents the application and the value of the method, for it has been applied in other cases with
much speedier success. For example, suppose that in a cryptogram of 6 alphabets the equivalents
of only THE in all 6 alphabets are fairly certain. As in the previous case, it is supposed that the
secondary alphabets are obtained by sliding a mixed alphabet against itself. Suppose the sec-
ondary alphabets to be as follows:

   1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
0  A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X  Y  Z
1  .  .  .  .  B  .  .  Q  .  .  .  .  .  .  .  .  .  .  .  E  .  .  .  .  .  .
2  .  .  .  .  C  .  .  L  .  .  .  .  .  .  .  .  .  .  .  X  .  .  .  .  .  .
3  .  .  .  .  I  .  .  V  .  .  .  .  .  .  .  .  .  .  .  C  .  .  .  .  .  .
4  .  .  .  .  N  .  .  P  .  .  .  .  .  .  .  .  .  .  .  B  .  .  .  .  .  .
5  .  .  .  .  X  .  .  O  .  .  .  .  .  .  .  .  .  .  .  P  .  .  .  .  .  .
6  .  .  .  .  T  .  .  Z  .  .  .  .  .  .  .  .  .  .  .  V  .  .  .  .  .  .

Figure 35.



(2) Consider the following chain of derivatives arranged diagrammatically:

T (0-20)  P (5-20)      E (1-20)  X (2-20)      B (4-20)  C (3-20)
H (0-8)   O (5-8)       Q (1-8)   L (2-8)       P (4-8)   V (3-8)
E (0-5)   X (5-5)       B (1-5)   C (2-5)       N (4-5)   I (3-5)

P (5-20)  V (6-20)      X (2-20)  T (0-20)      C (3-20)  E (1-20)
O (5-8)   Z (6-8)       L (2-8)   H (0-8)       V (3-8)   Q (1-8)
X (5-5)   T (6-5)       C (2-5)   E (0-5)       I (3-5)   B (1-5)

Figure 36.



(3) These pairs manifestly all belong to the same displacement interval, and therefore 
unions can be made immediately. The complete list is as follows: 

EX, QL, NI, LH, HO, BC, OZ, CE, TP, PV, XT, VQ, IB

(4) Joining pairs by their common letters, the following sequence is obtained: 

. . .NIBCEXTPVQLHOZ. . . 
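
The joining of pairs by their common letters is a simple linking operation. The sketch below, in Python, is offered only as an illustration under the assumption that all pairs belong to one displacement interval and that no letter has two different successors; the function name `join_pairs` is hypothetical and not drawn from the text.

```python
def join_pairs(pairs):
    """Link letter pairs such as 'EX' and 'XT' into the longest possible chains.

    Each pair (a, b) states that b stands one displacement interval after a.
    A chain is started at every letter that never appears as a second member.
    """
    successor = {a: b for a, b in pairs}
    second_members = set(successor.values())
    chains = []
    for start in successor:
        if start in second_members:
            continue  # not the head of a chain
        chain = [start]
        while chain[-1] in successor:
            chain.append(successor[chain[-1]])
        chains.append("".join(chain))
    return chains


# The thirteen pairs listed in subparagraph d (3).
PAIRS = ["EX", "QL", "NI", "LH", "HO", "BC", "OZ", "CE", "TP", "PV", "XT", "VQ", "IB"]

chains = join_pairs(PAIRS)
print(chains)
```

With fewer data the function returns several shorter chains instead of one, which then have to be united by the other principles of indirect symmetry. A closed cycle of pairs has no head letter and would need separate handling in this sketch.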



e. With this as a nucleus the cryptogram can be solved speedily and accurately. When 
it is realized that the cryptanalyst can assume THE’s rather readily in some cases, the value of 
this principle becomes apparent. When it is further realized that if a cryptogram has sufficient 
text to enable the THE’s to be found easily, it is usually also not at all difficult to make correct 
assumptions of values for two or three other high-frequency letters, it is clear that the principles 
of indirect symmetry of position may often be used with gratifyingly quick success to reconstruct 
the complete primary component. 

f. When the probable-word method is combined with the principles of indirect symmetry
the solution of a difficult case is often accomplished with astonishing ease and rapidity.

Section IX

REPEATING-KEY SYSTEMS WITH MIXED CIPHER ALPHABETS, III

                                                                                          Paragraph

Solution of messages enciphered by known primary components 37

Solution of repeating-key ciphers in which the identical mixed components proceed in opposite directions 38

Solution of repeating-key ciphers in which the primary components are different mixed sequences 39

Solution of subsequent messages after the primary components have been recovered 40



37. Solution of subsequent messages enciphered by the same primary components. — a. In 
the discussion of the methods of solving repeating-key ciphers using secondary alphabets derived 
from the sliding of a mixed component against the normal component (Section V), it was shown 
how subsequent messages enciphered by the same pair of primary components but with different 
keys could be solved by application of principles involving the completion of the plain-component 
sequence (paragraphs 23, 24). The present paragraph deals with the application of these same 
principles to the case where the primary components are identical mixed sequences. 

b. Suppose that the following primary component has been reconstructed from the analysis
of a lengthy cryptogram:

QUESTIONABLYCDFGHJKMPRVWXZ

A new message exchanged between the same correspondents is intercepted and is suspected
of having been enciphered by the same primary components but with a different key. The
message is as follows:

NFWWP NOMKI WPIDS CAAET QVZSE

YOJSC AAAFG RVNHD WDSCA EGNFP

FOEMT HXLJW PNOMK IQDBJ IVNHL

TFNCS BGCRP

c. Factoring discloses that the period is 7 letters. The text is transcribed accordingly, and 
is as follows: 

N F W W P N O
M K I W P I D
S C A A E T Q
V Z S E Y O J
S C A A A F G
R V N H D W D
S C A E G N F
P F O E M T H
X L J W P N O
M K I Q D B J
I V N H L T F
N C S B G C R
P

Figure 37.

d. The letters belonging to the same alphabet are then employed as the initial letters of
completion sequences, in the manner shown in paragraph 23a, using the already reconstructed
primary component. The completion diagrams for the first five letters of the first three alphabets
are as follows:



Alphabet 1

 N M S V S
 A P T W T
 B R I X I
 L V O Z O
 Y W N Q N
 C X A U A
 D Z B E B
 F Q L S L
 G U Y T Y
*H E C I C
 J S D O D
 K T F N F
 M I G A G
 P O H B H
 R N J L J
 V A K Y K
 W B M C M
 X L P D P
 Z Y R F R
 Q C V G V
 U D W H W
 E F X J X
 S G Z K Z
 T H Q M Q
 I J U P U
 O K E R E

Alphabet 2

 F K C Z C
 G M D Q D
 H P F U F
 J R G E G
 K V H S H
 M W J T J
 P X K I K
 R Z M O M
 V Q P N P
 W U R A R
 X E V B V
 Z S W L W
 Q T X Y X
 U I Z C Z
 E O Q D Q
 S N U F U
 T A E G E
 I B S H S
 O L T J T
 N Y I K I
*A C O M O
 B D N P N
 L F A R A
 Y G B V B
 C H L W L
 D J Y X Y

Alphabet 3

 W I A S A
 X O B T B
 Z N L I L
 Q A Y O Y
 U B C N C
 E L D A D
 S Y F B F
 T C G L G
 I D H Y H
 O F J C J
 N G K D K
 A H M F M
 B J P G P
 L K R H R
 Y M V J V
 C P W K W
 D R X M X
 F V Z P Z
 G W Q R Q
 H X U V U
 J Z E W E
 K Q S X S
 M U T Z T
 P E I Q I
 R S O U O
*V T N E N

Figure 38.
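
The completion diagrams can be produced mechanically by running every cipher letter of a column forward along the primary component. The sketch below, in Python, is only an illustration of that bookkeeping; the function name `generatrices` is hypothetical. It makes no attempt to choose the best row, which remains a matter of judging the assortment of high-frequency letters.

```python
PRIMARY = "QUESTIONABLYCDFGHJKMPRVWXZ"


def generatrices(letters, seq=PRIMARY):
    """Return the 26 rows of the completion diagram for one alphabet.

    Row k holds each initial letter advanced k places along the component.
    """
    n = len(seq)
    return ["".join(seq[(seq.index(c) + k) % n] for c in letters) for k in range(n)]


# First five letters of alphabets 1, 2, and 3 of the message in Figure 37.
rows_1 = generatrices("NMSVS")
rows_2 = generatrices("FKCZC")
rows_3 = generatrices("WIASA")

for k, row in enumerate(rows_1):
    print(k, " ".join(row))
```

Row 0 of each list is the cipher text itself, so the row number of the chosen generatrix is the displacement between cipher and plain letters for that alphabet.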

e. Examining the successive generatrices to select the ones showing the best assortment of
high-frequency letters, those marked in Figure 38 by asterisks are chosen. These are then assem-
bled in columnar fashion and yield the following plain text:

H A V
E C T
C O N
I M E
C O N

Figure 39.

f. The corresponding key-letters are sought, using enciphering equations θk/c=θi/p; θp/p=θc/c,
and are found to be JOU, which suggests the keyword JOURNEY. Testing the key-letters
for alphabets 4, 5, 6, and 7, the following results are obtained:

1 2 3 4 5 6 7
J O U R N E Y
N F W W P N O
H A V E D I R
M K I W P I D
E C T E D S E

Figure 40.

The message may now be completed with ease. It is as follows:

J O U R N E Y
H A V E D I R
N F W W P N O
E C T E D S E
M K I W P I D
C O N D R E G
S C A A E T Q
I M E N T T O
V Z S E Y O J
C O N D U C T
S C A A A F G
T H O R O R E
R V N H D W D
C O N N A I S
S C A E G N F

J O U R N E Y
S A I N C E I
P F O E M T H
N T H E D I R
X L J W P N O
E C T I O N O
M K I Q D B J
F H O R S E S
I V N H L T F
H O E F A L L
N C S B G C R
S
P

Figure 41.
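
The decipherment in Figure 41 can be reproduced with a few lines of Python. This is a sketch under one stated assumption: that the index letter is Q, the first letter of the component, so that the position of the cipher letter equals the position of the plain letter plus that of the key letter, counted along the QUESTIONABLY component modulo 26. The function names are illustrative only.

```python
PRIMARY = "QUESTIONABLYCDFGHJKMPRVWXZ"
POS = {letter: i for i, letter in enumerate(PRIMARY)}


def decipher(cipher, key):
    """Plain position = cipher position - key position (mod 26)."""
    return "".join(
        PRIMARY[(POS[c] - POS[key[i % len(key)]]) % 26] for i, c in enumerate(cipher)
    )


def encipher(plain, key):
    """Cipher position = plain position + key position (mod 26)."""
    return "".join(
        PRIMARY[(POS[p] + POS[key[i % len(key)]]) % 26] for i, p in enumerate(plain)
    )


def recover_key(cipher, plain):
    """Key position = cipher position - plain position (mod 26)."""
    return "".join(PRIMARY[(POS[c] - POS[p]) % 26] for c, p in zip(cipher, plain))


# The message of Figure 37, written by periods of seven.
CIPHER = "".join([
    "NFWWPNO", "MKIWPID", "SCAAETQ", "VZSEYOJ", "SCAAAFG", "RVNHDWD",
    "SCAEGNF", "PFOEMTH", "XLJWPNO", "MKIQDBJ", "IVNHLTF", "NCSBGCR", "P",
])

key_start = recover_key("NFW", "HAV")   # key letters for alphabets 1 to 3
plain = decipher(CIPHER, "JOURNEY")
print(key_start)
print(plain)
```

Enciphering the recovered plain text again with the same key should return the original cipher text, which gives a quick check on the assumed equations.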



38. Solution of repeating-key ciphers in which the identical mixed components proceed in
opposite directions. — The secondary alphabets in this case (paragraph 6, Case B (3) (a) (II))
are reciprocal. The steps in solution are essentially the same as in the preceding case (para-
graph 28); the principles of indirect symmetry of position can also be applied with the necessary
modifications introduced by virtue of the reciprocity existing within the respective secondary
alphabets (paragraph 31p).

39. Solution of repeating-key ciphers in which the primary components are different mixed
sequences. — This is Case B (3) (b) of paragraph 6. The steps in solution are essentially the same
as in paragraphs 28 and 31, except that in applying the principles of indirect symmetry of posi-
tion it is necessary to take cognizance of the fact that the primary components are different
mixed sequences (paragraph 31q).

40. Solution of subsequent messages after the primary components have been recovered. —
a. In the case in which the primary components are identical mixed sequences proceeding in
opposite directions, as well as in that in which the primary components are different mixed
sequences, the solution of subsequent messages ¹ is a relatively easy matter. In both cases, how-
ever, the student must remember that before the method illustrated in paragraph 37 can be
applied it is necessary to convert the cipher letters into their plain-component equivalents
before completing the plain-component sequence. From there on, the process of selecting and
assembling the proper generatrices is the same as usual.

b. Perhaps an example may be advisable. Suppose the enemy has been found to be using
primary components based upon the keyword QUESTIONABLY, the plain component running
from left to right, the cipher component in the reverse direction. The following new message
has arrived from the intercept station:

MVXOX BZIYZ NLWZH OXIEO OOEPZ

FXSRX EJBSH BONAU RAPZI NRAMV

XOXAI JYXWF KNDOW JERCU RALVB

ZAQUW JWXYI DGRKD QBDRM QECYV

QW



1 2 3 4 5 6
M V X O X B
Z I Y Z N L
W Z H O X I
E O O O E P
Z F X S R X
E J B S H B
O N A U R A
P Z I N R A
M V X O X A
I J Y X W F
K N D O W J
E R C U R A
L V B Z A Q
U W J W X Y
I D G R K D
Q B D R M Q
E C Y V Q W

Figure 42.

1 2 3 4 5 6
O S U M U H
Q P F Q K G
E Q B M U P
W M M M W I
Q Y U V T U
W A H V B H
M K J X T J
I Q P K T J
O S U M U J
P A F U E Y
N K C M E A
W T D X T J
G S H Q J Z
X E A E U F
P C L T N C
Z H C T O Z
W D F S Z E

Figure 43.

c. Factoring discloses that the period is 6 and the message is accordingly transcribed into
6 columns, Fig. 42. The letters of these columns are then converted into their plain-component
equivalents by juxtaposing the two primary components at any point of coincidence, for
example Qp=Zc. The converted letters are shown in Fig. 43. The letters of the individual
columns are then used as the initial letters of completion sequences, using the QUESTIONABLY
primary sequence. The final step is the selection and assembling of the selected generatrices.
The results for the first ten letters of the first three columns are shown below:



¹ That is, messages intercepted after the primary components have been reconstructed and enciphered by
keys different from those used in the messages upon which the reconstruction of the primary components was
accomplished.

[Completion sequences for the first ten letters of columns 1, 2, and 3. Initial letters: column 1,
O Q E W Q W M I O P; column 2, S P Q M Y A K Q S A; column 3, U F B M U H J P U F.]

Figure 44.

Columnar assembling of selected generatrices gives what is shown in Fig. 45.

1 2 3 4 5 6
F I R . . .
A V A . . .
L E S . . .
I R D . . .
A D R . . .
I L L . . .
U P Y . . .
D E F . . .
F I R . . .
E L A . . .

Figure 45.

d. The key letters are sought, and found to be NUM, which suggests NUMBER. The entire
message may now be read with ease. It is as follows:

N U M B E R
F I R S T C
M V X O X B
A V A L R Y
Z I Y Z N L
L E S S T H
W Z H O X I
I R D S Q U
E O O O E P
A D R O N W
Z F X S R X
I L L O C C
E J B S H B
U P Y A N D
O N A U R A
D E F E N D
P Z I N R A
F I R S T D
M V X O X A

N U M B E R
E L A Y I N
I J Y X W F
G P O S I T
K N D O W J
I O N A N D
E R C U R A
W I L L P R
L V B Z A Q
O T E C T L
U W J W X Y
E F T F L A
I D G R K D
N K O F B R
Q B D R M Q
I G A D E X
E C Y V Q W

Figure 46.
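
The two steps peculiar to this case, conversion into plain-component equivalents and completion of the plain-component sequence, can be sketched in Python as follows. The sketch assumes the juxtaposition Qp=Zc used above, so that a cipher letter standing at position i of the QUESTIONABLY component converts to the letter at position 25 - i; the function names are illustrative and not taken from the text.

```python
PRIMARY = "QUESTIONABLYCDFGHJKMPRVWXZ"
POS = {letter: i for i, letter in enumerate(PRIMARY)}


def to_plain_component(cipher):
    """Convert cipher letters to plain-component equivalents (Qp=Zc juxtaposition)."""
    return "".join(PRIMARY[25 - POS[c]] for c in cipher)


def generatrix(letters, k):
    """Advance every letter k places along the plain component."""
    return "".join(PRIMARY[(POS[c] + k) % 26] for c in letters)


def decipher_reversed(cipher, key):
    """Direct decipherment: plain position = key position - cipher position (mod 26)."""
    return "".join(
        PRIMARY[(POS[key[i % len(key)]] - POS[c]) % 26] for i, c in enumerate(cipher)
    )


first_row = to_plain_component("MVXOXB")            # first line of Figure 42
column_1 = to_plain_component("MZWEZEOPMI")         # first ten letters of column 1
column_1_plain = generatrix(column_1, 8)            # the generatrix selected for column 1
opening = decipher_reversed("MVXOXBZIYZNL", "NUMBER")

print(first_row, column_1, column_1_plain, opening)
```

The displacement 8 used for column 1 is simply the row on which good plain text appears; once the keyword is known the conversion step can be skipped and the direct rule in `decipher_reversed` applied.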

e. If the primary components are different mixed sequences, the procedure is identical with
that just indicated. The important point to note is that one must not fail to convert the letters
into their plain-component equivalents before the completion-sequence method is applied.

REPEATING-KEY SYSTEMS WITH MIXED CIPHER ALPHABETS, IV

                                                                                          Paragraph

General remarks 41

Deriving the secondary alphabets, the primary components, and the key, given a cryptogram with its plain text 42

Deriving the secondary alphabets, the primary components, and the keywords for messages, given two or more cryptograms in different keys and suspected to contain identical plain text 43

The case of repeating-key systems 44

The case of identical messages enciphered by keywords of different lengths 45

Concluding remarks 46



41. General remarks. — The preceding three sections have been devoted to an elucidation 
of the general principles and procedure in the solution of typical cases of repeating-key ciphers. 
This section will be devoted to a consideration of the variations in cryptanalytic procedure arising 
from special circumstances. It may be well to add that by the designation “special circum- 
stances” it is not meant to imply that the latter are necessarily unusual circumstances. The 
student should always be on the alert to seize upon any opportunities that may appear in which he may
apply the methods to be described. In practical work such opportunities are by no means rare and
are seldom overlooked by competent cryptanalysts. 

42. Deriving the secondary alphabets, the primary components, and the key, given a 
cryptogram with its plain text. — a. It may happen that a cryptogram and its equivalent plain 
text are at hand, as the result of capture, pilferage, compromise, etc. This, as a general rule, 
affords a very easy attack upon the whole system. 

b. Taking first the case where the plain component is the normal alphabet, the cipher com-
ponent a mixed sequence, the first thing to do is to write out the cipher text with its letter-for-
letter decipherment. From this, by a slight modification of the principles of “factoring”, one dis-
covers the length of the key. It is obvious that when a word of three or four letters is enciphered
twice by the same cipher letters, the interval between the two occurrences is almost certainly a
multiple of the length of the key. By noting a few recurrences of plain text and cipher letters, one can
quickly determine the length of the key (assuming of course that the message is long enough to
afford sufficient data). Having determined the length of the key, the message is rewritten accord-
ing to its periods, with the plain text likewise in periods under the cipher letters. From this
arrangement one can now reconstruct complete or partial secondary alphabets. If the secondary
alphabets are complete, they will show direct symmetry of position; if they are but fragmentary
in several alphabets, then the primary component can be reconstructed by the application of the
principles of direct symmetry of position.
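
The modified factoring described above can be sketched in Python. The example is an illustration only: for want of a message of the exact type discussed here, it borrows the cryptogram and plain text of Figure 41, where the components are identical mixed sequences, since the reasoning about recurrences is the same. The function name `key_length_from_crib` is hypothetical.

```python
from collections import defaultdict
from functools import reduce
from math import gcd


def key_length_from_crib(cipher, plain, n=3):
    """Estimate the key length from recurrences of plain text with identical cipher text.

    Every interval between two places where the same n plain letters are
    represented by the same n cipher letters is taken as a multiple of the
    key length; the greatest common divisor of all such intervals is returned.
    """
    seen = defaultdict(list)
    for i in range(len(cipher) - n + 1):
        seen[(cipher[i:i + n], plain[i:i + n])].append(i)
    intervals = [b - a for places in seen.values() for a, b in zip(places, places[1:])]
    return reduce(gcd, intervals, 0)


CIPHER = "".join([
    "NFWWPNO", "MKIWPID", "SCAAETQ", "VZSEYOJ", "SCAAAFG", "RVNHDWD",
    "SCAEGNF", "PFOEMTH", "XLJWPNO", "MKIQDBJ", "IVNHLTF", "NCSBGCR", "P",
])
PLAIN = "".join([
    "HAVEDIR", "ECTEDSE", "CONDREG", "IMENTTO", "CONDUCT", "THORORE",
    "CONNAIS", "SAINCEI", "NTHEDIR", "ECTIONO", "FHORSES", "HOEFALL", "S",
])

period = key_length_from_crib(CIPHER, PLAIN)
print(period)
```

A result of 0 means no recurrence was found, and with very few recurrences the figure returned may be a multiple of the true key length rather than the key length itself.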

c. If the plain component is a mixed sequence, and the cipher component the normal (direct or
reversed sequence), the secondary alphabets will show no direct symmetry unless they are ar-
ranged in the form of deciphering alphabets (that is, Ac . . . Zc above the zero line, with their
equivalents below). The student should be on the lookout for such cases.

d. (1) If the plain and cipher primary components are identical mixed sequences proceeding
in the same direction, the secondary alphabets will show indirect symmetry of position, and they
can be used for the speedy reconstruction of the primary components (paragraph 31a to o).

(2) If the plain and the cipher primary components are identical mixed sequences proceeding
in opposite directions, the secondary alphabets will be completely reciprocal secondary alphabets
and the primary component may be reconstructed by applying the principles outlined in para-
graph 31p.

(3) If the plain and the cipher primary components are different mixed sequences, the
secondary alphabets will show indirect symmetry of position and the primary components may
be reconstructed by applying the principles outlined in paragraph 31q.

e. In all the foregoing cases, after the primary components have been reconstructed, the 
keys can be readily recovered. 

43. Deriving the secondary alphabets, the primary components, and the keywords for 
messages, given two or more cryptograms in different keys and suspected to contain identical 
plain text. — a. The simplest case of this kind is that involving two monoalphabetic substitution 
ciphers with mixed alphabets derived from the same pair of sliding components. An understand- 
ing of this case is necessary to that of the case involving repeating-key ciphers. 

b. (1) A message is transmitted from station A to station B. B then sends A some operating
signals which indicate that B cannot decipher the message, and soon thereafter A sends a second
message, identical in length with the first. This leads to the suspicion that the plain text of both
messages is the same. The intercepted messages are superimposed. Thus:

1. NXGRV MPUOF ZQVCP VWERX QDZVX WXZQE TBDSP VWXJK RFZWH ZUWLU IYVZQ FXOAR

2. EMLHJ FGVUB PRJNG JKWHM RAPJM KMPRW ZTAXG JKMCD HBPKY PVKIV QOJPR BMUSH

(2) Initiating a chain of cipher-text equivalents from message 1 to message 2, the following
complete sequence is obtained:

 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
 N  E  W  K  D  A  S  X  M  F  B  T  Z  P  G  L  I  Q  R  H  Y  O  U  V  J  C
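
The chain can be built by pairing the two messages letter for letter and following each cipher letter of message 1 to the letter standing under it in message 2. The Python sketch below is an illustration only; the function name `equivalent_chain` is hypothetical, and the eighth groups of the two messages are taken as VWXJK and JKMCD, a reading that is assumed here because it is the one consistent with the other eleven groups. The final lines compare the chain with the QUESTIONABLY component decimated on the 21st interval.

```python
def equivalent_chain(message_1, message_2, start):
    """Follow cipher letters from message 1 to message 2 until the chain closes."""
    step = {}
    for a, b in zip(message_1, message_2):
        if step.setdefault(a, b) != b:
            raise ValueError(f"{a} is paired with both {step[a]} and {b}")
    chain = [start]
    nxt = step.get(start)
    while nxt is not None and nxt not in chain:
        chain.append(nxt)
        nxt = step.get(nxt)
    return "".join(chain)


MSG_1 = ("NXGRV MPUOF ZQVCP VWERX QDZVX WXZQE "
         "TBDSP VWXJK RFZWH ZUWLU IYVZQ FXOAR").replace(" ", "")
MSG_2 = ("EMLHJ FGVUB PRJNG JKWHM RAPJM KMPRW "
         "ZTAXG JKMCD HBPKY PVKIV QOJPR BMUSH").replace(" ", "")

chain = equivalent_chain(MSG_1, MSG_2, "N")

# Decimation of the original primary component on the 21st interval, starting at N.
PRIMARY = "QUESTIONABLYCDFGHJKMPRVWXZ"
decimated = "".join(PRIMARY[(PRIMARY.index("N") + 21 * k) % 26] for k in range(26))

print(chain)
print(decimated)
```

If the plain text does not contain every letter, the loop stops early and only a partial chain is returned; the function must then be called again from another starting letter to collect the remaining fragments.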

(3) Experimentation along already-indicated lines soon discloses the fact that the foregoing 
component is an equivalent primary component of the original primary based upon the keyword 
QUESTIONABLY, decimated on the 21st interval. Let the student decipher the cryptogram. 

(4) The foregoing example is somewhat artificial in that the plain text was consciously 
selected with a view to making it contain every letter of the alphabet. The purpose in doing 
this was to permit the construction of a complete chain of equivalents from only two short 
messages, in order to give a simple illustration of the principles involved. If the plain-text message 
does not contain every letter of the alphabet, then only partial chains of equivalents can be con- 
structed. These may be united, if circumstances will permit, by recourse to the various prin- 
ciples elucidated in paragraph 31. 

(5) The student should carefully study the foregoing example in order to obtain a thorough 
comprehension of the reason why it was possible to reconstruct the primary component from the 
two cipher messages without having any plain text to begin with at all. Since the plain text of 
both messages is the same, the relative displacement of the primary components in the case of 
message 1 differs from the relative displacement of the same primary components in the case of 
message 2 by a fixed interval. Therefore, the distance between N and E (the first letters of the 
two messages), on the primary component, regardless of what plain-text letter these two 
cipher letters represent, is the same as the distance between E and W (the 18th letters), W and K 
(the 17th letters), and so on. Thus, this fixed interval permits of establishing a complete chain 
of letters separated by constant intervals and this chain becomes an equivalent primary com- 
ponent. 
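
The chaining of cipher-text equivalents described in subparagraph b may be sketched in Python as follows. This is only an illustrative sketch: the function name `chain_equivalents` is hypothetical, and the two messages are entered as read from the superimposition above, on the assumption that the 8th groups are VWXJK and JKMCD (the readings that keep every equivalent consistent with the chain).

```python
def chain_equivalents(msg1: str, msg2: str) -> str:
    """Chain cipher equivalents from message 1 to message 2.

    Both messages are assumed to contain the same plain text,
    enciphered monoalphabetically from the same sliding components.
    """
    a = msg1.replace(" ", "")
    b = msg2.replace(" ", "")
    if len(a) != len(b):
        raise ValueError("messages must be identical in length")

    # Each letter of message 1 must always stand over the same letter of message 2.
    step = {}
    for x, y in zip(a, b):
        if step.setdefault(x, y) != y:
            raise ValueError(f"conflicting equivalents for {x}")

    # Follow N -> E -> W -> ... until the chain closes or breaks off.
    chain = [a[0]]
    while chain[-1] in step and step[chain[-1]] not in chain:
        chain.append(step[chain[-1]])
    return "".join(chain)


MSG1 = "NXGRV MPUOF ZQVCP VWERX QDZVX WXZQE TBDSP VWXJK RFZWH ZUWLU IYVZQ FXOAR"
MSG2 = "EMLHJ FGVUB PRJNG JKWHM RAPJM KMPRW ZTAXG JKMCD HBPKY PVKIV QOJPR BMUSH"

chain = chain_equivalents(MSG1, MSG2)
print(chain)
```

If the plain text did not contain every letter, the loop would stop early and yield only a partial chain, which is the situation noted in subparagraph b (4).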

44. The case of repeating-key systems. — a. With the foregoing basic principles in mind 
the student is ready to note the procedure in the case of two repeating-key ciphers having identical 
plain texts. First, the case in which both messages have keywords of identical length but different 
compositions will be studied. 

b. (1) Given the following two cryptograms suspected to contain the same plain text: 

Message 1 

YHYEX UBUKA PVLLT ABUVV DYSAB 
PCQTU NGKFA ZEFIZ BDJEZ ALVID 
TROQS UHAFK 

Message 2 

CGSLZ QUBMN CTYBV HLQFT FLRHL 
MTAIQ ZWMDQ NSDWN LCBLQ NETOC 
VSNZR BJNOQ 


(2) The first step is to try to determine the length of the period. The usual method of 
factoring cannot be employed because there are no long repetitions and not enough repetitions 
even of digraphs to give any convincing indications. However, a subterfuge will be employed, 
based upon the theory of factoring. 



c. (1) Let the two messages be superimposed. 

    1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 16 17 18 19 20 
1.  Y  H  Y  E  X  U  B  U  K  A  P  V  L  L  T  A  B  U  V  V 
2.  C  G  S  L  Z  Q  U  B  M  N  C  T  Y  B  V  H  L  Q  F  T 

    21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 
1.  D  Y  S  A  B  P  C  Q  T  U  N  G  K  F  A  Z  E  F  I  Z 
2.  F  L  R  H  L  M  T  A  I  Q  Z  W  M  D  Q  N  S  D  W  N 

    41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 
1.  B  D  J  E  Z  A  L  V  I  D  T  R  O  Q  S  U  H  A  F  K 
2.  L  C  B  L  Q  N  E  T  O  C  V  S  N  Z  R  B  J  N  O  Q 

(2) Now let a search be made of cases of identical superimposition. For example, the pairs 
E over L at the 4th and 44th positions are separated by 40 letters, and the pairs U over Q at the 
6th, 18th, and 30th positions are separated by 12 letters. Let these intervals between 
identical superimpositions be factored, just as though they were ordinary repetitions. That 
factor which is the most frequent should correspond with the length of the period for the following 
reason. If the period is the same and the plain text is the same in both messages, then the con- 
dition of identity of superimposition can only be the result of identity of encipherments by 
identical cipher alphabets. This is only another way of saying that the same relative position in 
the keying cycle has been reached in both cases of identity. Therefore, the distance between 
identical superimpositions must be either equal to or else a multiple of the length of the period. 
Hence, factoring the intervals must yield the length of the period. The complete list of intervals 
and factors applicable to cases of identical superimposed pairs is as follows (factors above 12 
are omitted): 

Repetition             Interval    Factors 

1st EL to 2d EL           40       2, 4, 5, 8, 10. 
1st UQ to 2d UQ           12       2, 3, 4, 6, 12. 
2d UQ to 3d UQ            12       2, 3, 4, 6, 12. 
1st UB to 2d UB           48       2, 3, 4, 6, 8, 12. 
1st KM to 2d KM           24       2, 3, 4, 6, 8, 12. 
1st AN to 2d AN           36       2, 3, 4, 6, 9, 12. 
2d AN to 3d AN            12       2, 3, 4, 6, 12. 
1st VT to 2d VT            8       2, 4, 8. 
2d VT to 3d VT            28       2, 4, 7. 
1st TV to 2d TV           36       2, 3, 4, 6, 9, 12. 
1st AH to 2d AH            8       2, 4, 8. 
1st BL to 2d BL            8       2, 4, 8. 
2d BL to 3d BL            16       2, 4, 8. 
1st SR to 2d SR           32       2, 4, 8. 
1st FD to 2d FD            4       2, 4. 
1st ZN to 2d ZN            4       2, 4. 
1st DC to 2d DC            8       2, 4, 8. 

(3) The factor 4 is the largest one common to every one of these intervals and it may be taken 
as beyond question that the length of the period is 4. 
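
The search for identical superimpositions and the factoring of their intervals may be sketched in Python as follows. The helper names are hypothetical, and taking the greatest common divisor of all the intervals is used here as a shorthand for picking the factor common to every interval; with less regular data a count of the most frequent factor would be the safer course.

```python
from collections import defaultdict
from functools import reduce
from math import gcd

MSG1 = "YHYEX UBUKA PVLLT ABUVV DYSAB PCQTU NGKFA ZEFIZ BDJEZ ALVID TROQS UHAFK"
MSG2 = "CGSLZ QUBMN CTYBV HLQFT FLRHL MTAIQ ZWMDQ NSDWN LCBLQ NETOC VSNZR BJNOQ"


def identical_superimpositions(msg1: str, msg2: str):
    """Return (pair, interval) for each recurrence of the same superimposed pair."""
    a, b = msg1.replace(" ", ""), msg2.replace(" ", "")
    positions = defaultdict(list)
    for i, (x, y) in enumerate(zip(a, b), start=1):
        positions[x + y].append(i)
    intervals = []
    for pair, where in positions.items():
        # Interval from each occurrence to the next one of the same pair.
        for first, second in zip(where, where[1:]):
            intervals.append((pair, second - first))
    return intervals


def factor_counts(intervals, max_factor: int = 12):
    """Count, for each factor 2..max_factor, how many intervals it divides."""
    return {f: sum(1 for _, d in intervals if d % f == 0)
            for f in range(2, max_factor + 1)}


intervals = identical_superimpositions(MSG1, MSG2)
period = reduce(gcd, (d for _, d in intervals))
print(intervals)
print(factor_counts(intervals))
print("period:", period)
```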

d. Let the messages now be superimposed according to their periods: 

    1 2 3 4   1 2 3 4   1 2 3 4   1 2 3 4   1 2 3 4   1 2 3 4   1 2 3 4 

1.  Y H Y E   X U B U   K A P V   L L T A   B U V V   D Y S A   B P C Q 
2.  C G S L   Z Q U B   M N C T   Y B V H   L Q F T   F L R H   L M T A 

1.  T U N G   K F A Z   E F I Z   B D J E   Z A L V   I D T R   O Q S U 
2.  I Q Z W   M D Q N   S D W N   L C B L   Q N E T   O C V S   N Z R B 

1.  H A F K 
2.  J N O Q 

e. (1) Now distribute the superimposed letters into a reconstruction skeleton of “secondary 
alphabets.” 

Thus: 

0   A B C D E F G H I J K L M N O P Q R S T U V W X Y Z 
1   . L . F S . . J O . M Y . . N . . . . I . . . Z C Q 
2   N . . C . D . G . . . B . . . M Z . . . Q . . . L . 
3   Q U T . . O . . W B . E . Z . C . . R V . F . . S . 
4   H . . . L . W . . . Q . . . . . A S . . B T . . . N 
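
The distribution of the superimposed pairs into the skeleton may be sketched in Python as follows. The function name is hypothetical; each row of the result holds, for one position in the period, the letter of Message 2 found under each letter of Message 1, with a period standing for an empty cell.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

MSG1 = "YHYEX UBUKA PVLLT ABUVV DYSAB PCQTU NGKFA ZEFIZ BDJEZ ALVID TROQS UHAFK"
MSG2 = "CGSLZ QUBMN CTYBV HLQFT FLRHL MTAIQ ZWMDQ NSDWN LCBLQ NETOC VSNZR BJNOQ"


def reconstruction_skeleton(msg1: str, msg2: str, period: int):
    """Return one {message-1 letter: message-2 letter} table per key position."""
    a, b = msg1.replace(" ", ""), msg2.replace(" ", "")
    rows = [dict() for _ in range(period)]
    for i, (x, y) in enumerate(zip(a, b)):
        rows[i % period][x] = y
    return rows


rows = reconstruction_skeleton(MSG1, MSG2, 4)
print("0  ", " ".join(ALPHABET))
for n, row in enumerate(rows, start=1):
    print(n, " ", " ".join(row.get(c, ".") for c in ALPHABET))

# The pairs of lines 0 and 1, as used in the next step of the reconstruction.
pairs_line_1 = " ".join(x + y for x, y in sorted(rows[0].items()))
print(pairs_line_1)
```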



(2) By the usual methods, construct the primary or an equivalent primary component. 
Taking lines 0 and 1, the following sequences are noted: 

BL, DF, ES, HJ, IO, KM, LY, ON, TI, XZ, YC, ZQ, 

which, when united by means of common letters and study of other sequences, yield the complete 
original primary component based upon the keyword QUESTIONABLY : 

QUESTIONABLYCDFGHJKMPRVWXZ 

(3) The fact that the pair of lines with which the process was commenced yield the original, 
primary sequence is purely accidental; it might have just as well yielded an equivalent primary 
sequence. 
f. (1) Having the primary component, the solution of the messages is now a relatively simple 
matter. An application of the method elucidated in paragraph 37 is made, involving the completion 
of the plain-component sequence for each alphabet and selecting those generatrices which 
contain the best assortments of high-frequency letters. Thus, using Message 1: 

 First        Second       Third        Fourth 
 Alphabet     Alphabet     Alphabet     Alphabet 

 Y X K L B    H U A L U    Y B P T V    E U V A V 
 C Z M Y L    J E B Y E    C L R I W    S E W B W 
 D Q P C Y    K S L C S    D Y V O X    T S X L X 
 F U R D C    M T Y D T    F C W N Z    I T Z Y Z 
 G E V F D    P I C F I    G D X A Q    O I Q C Q 
 H S W G F    R O D G O    H F Z B U    N O U D U 
 J T X H G    V N F H N    J G Q L E   *A N E F E 
 K I Z J H    W A G J A    K H U Y S    B A S G S 
 M O Q K J    X B H K B    M J E C T    L B T H T 
 P N U M K    Z L J M L    P K S D I    Y L I J I 
 R A E P M    Q Y K P Y    R M T F O    C Y O K O 
 V B S R P    U C M R C    V P I G N    D C N M N 
 W L T V R    E D P V D    W R O H A    F D A P A 
 X Y I W V    S F R W F    X V N J B    G F B R B 
 Z C O X W    T G V X G    Z W A K L    H G L V L 
 Q D N Z X    I H W Z H    Q X B M Y    J H Y W Y 
 U F A Q Z    O J X Q J    U Z L P C    K J C X C 
 E G B U Q    N K Z U K    E Q Y R D    M K D Z D 
 S H L E U    A M Q E M    S U C V F    P M F Q F 
 T J Y S E    B P U S P    T E D W G    R P G U G 
 I K C T S   *L R E T R    I S F X H    V R H E H 
 O M D I T    Y V S I V    O T G Z J    W V J S J 
 N P F O I    C W T O W    N I H Q K    X W K T K 
*A R G N O    D X I N X    A O J U M    Z X M I M 
 B V H A N    F Z O A Z    B N K E P    Q Z P O P 
 L W J B A    G Q N B Q   *L A M S R    U Q R N R 

Figure 48. 
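
The completion of the plain-component sequence may be sketched in Python as follows. The function name `generatrices` is hypothetical; it merely slides each cipher letter along the reconstructed primary component, so that row r of the output is the r-th generatrix. The rows printed are those marked by asterisks in the figure; their selection by eye for high-frequency letters is not automated here.

```python
PRIMARY = "QUESTIONABLYCDFGHJKMPRVWXZ"


def generatrices(cipher_letters: str, component: str = PRIMARY):
    """Return all generatrices for the letters of one secondary alphabet."""
    n = len(component)
    return [
        "".join(component[(component.index(c) + r) % n] for c in cipher_letters)
        for r in range(n)
    ]


# First five cipher letters of each alphabet of Message 1, with the row chosen.
for letters, row in (("YXKLB", 23), ("HUALU", 20), ("YBPTV", 25), ("EUVAV", 6)):
    print(letters, "->", generatrices(letters)[row])
```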



(2) The selected generatrices (those marked by asterisks in Fig. 48) are assembled in 
columnar manner: 

A L L A 
R R A N 
G E M E 
N T S F 
O R R E 

Figure 49. 

(3) The key letters are sought and give the keyword SOUP. The plain text for the second 
message is now known, and by reference to the cipher text and the primary components, the 
keyword for this message is found to be TIME. The complete texts are as follows: 



  No. 1          No. 2 

 S O U P        T I M E 

 A L L A        A L L A 
 Y H Y E        C G S L 
 R R A N        R R A N 
 X U B U        Z Q U B 
 G E M E        G E M E 
 K A P V        M N C T 
 N T S F        N T S F 
 L L T A        Y B V H 
 O R R E        O R R E 
 B U V V        L Q F T 
 L I E F        L I E F 
 D Y S A        F L R H 
 O F Y O        O F Y O 
 B P C Q        L M T A 
 U R O R        U R O R 
 T U N G        I Q Z W 
 G A N I        G A N I 
 K F A Z        M D Q N 
 Z A T I        Z A T I 
 E F I Z        S D W N 
 O N H A        O N H A 
 B D J E        L C B L 
 V E B E        V E B E 
 Z A L V        Q N E T 
 E N S U        E N S U 
 I D T R        O C V S 
 S P E N        S P E N 
 O Q S U        N Z R B 
 D E D X        D E D X 
 H A F K        J N O Q 

Figure 50. 
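
The solution may be checked by re-enciphering the recovered plain text, as in the Python sketch below. It rests on an assumption drawn from the figures rather than stated in the text: both sliding components are the QUESTIONABLY sequence running in the same direction, and the key letter on the cipher component is set opposite Q, the first letter of the plain component. The function name is hypothetical.

```python
PRIMARY = "QUESTIONABLYCDFGHJKMPRVWXZ"

PLAIN = "ALLARRANGEMENTSFORRELIEFOFYOURORGANIZATIONHAVEBEENSUSPENDEDX"
MSG1 = "YHYEXUBUKAPVLLTABUVVDYSABPCQTUNGKFAZEFIZBDJEZALVIDTROQSUHAFK"
MSG2 = "CGSLZQUBMNCTYBVHLQFTFLRHLMTAIQZWMDQNSDWNLCBLQNETOCVSNZRBJNOQ"


def encipher(plain: str, keyword: str, component: str = PRIMARY) -> str:
    """Repeating-key encipherment with identical direct primary components."""
    n = len(component)
    out = []
    for i, p in enumerate(plain):
        k = keyword[i % len(keyword)]
        # Sliding the cipher component so that k stands opposite Q adds the
        # key letter's position to the plain letter's position.
        out.append(component[(component.index(p) + component.index(k)) % n])
    return "".join(out)


print(encipher(PLAIN, "SOUP") == MSG1)
print(encipher(PLAIN, "TIME") == MSG2)
```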



45. The case of identical messages enciphered by keywords of different lengths. — a. In the 
foregoing case the keywords for the two messages, although different, were identical in length. 
When this is not true and the keywords are of different lengths, the procedure need be only 
slightly modified. 
b. Given the following two cryptograms suspected of containing the same plain text 
enciphered by the same primary components but with different keywords of different lengths, solve 
the messages. 

Message No. 1 

VMYZG EAUNT PKFAY JIZMB UMYKB VFIVV 
SEOAF SKXKR YWCAC ZORDO ZRDEF BLKFE 
SMKSF AFEKV QURCM YZVOX VABTA YYUOA 
YTDKF ENWNT DBQKU LAJLZ IOUMA BOAFS 
KXQPU YMJPW QTDBT OSIYS MIYKU ROGMW 
CTMZZ VMVAJ 

Message No. 2 

ZGANW IOMOA CODHA CLRLP MOQOJ EMOQU 
DHXBY UQMGA UVGLQ DBSPU OABIR PWXYM 
OGGFT MRHVF GWKNI VAUPF ABRVI LAQEM 
ZDJXY MEDDY BOSVM PNLGX XDYDO PXBYU 
QMNKY FLUYY GVPVR DNCZE KJQOR WJXRV 
GDKDS XCEEC 



































c. The messages are long enough to show a few short repetitions which permit factoring. 
The latter discloses that Message 1 has a period of 4 and Message 2, a period of 6 letters. The 
messages are superimposed, with numbers marking the position of each letter in the corresponding 
period, as shown below: 





        1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 
No. 1.  V M Y Z G E A U N T P K F A Y J I Z M B U M Y K 
No. 2.  Z G A N W I O M O A C O D H A C L R L P M O Q O 
        1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 

        1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 
No. 1.  B V F I V V S E O A F S K X K R Y W C A C Z O R 
No. 2.  J E M O Q U D H X B Y U Q M G A U V G L Q D B S 
        1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 

        1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 
No. 1.  D O Z R D E F B L K F E S M K S F A F E K V Q U 
No. 2.  P U O A B I R P W X Y M O G G F T M R H V F G W 
        1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 

        1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 
No. 1.  R C M Y Z V O X V A B T A Y Y U O A Y T D K F E 
No. 2.  K N I V A U P F A B R V I L A Q E M Z D J X Y M 
        1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 

        1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 
No. 1.  N W N T D B Q K U L A J L Z I O U M A B O A F S 
No. 2.  E D D Y B O S V M P N L G X X D Y D O P X B Y U 
        1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 

        1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 
No. 1.  K X Q P U Y M J P W Q T D B T O S I Y S M I Y K 
No. 2.  Q M N K Y F L U Y Y G V P V R D N C Z E K J Q O 
        1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 

        1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 
No. 1.  U R O G M W C T M Z Z V M V A J 
No. 2.  R W J X R V G D K D S X C E E C 
        1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 
d. A reconstruction skeleton of “secondary alphabets” is now made by distributing the 
letters in respective lines corresponding to the 12 different superimposed pairs of numbers. For 
example, all pairs corresponding to the superimposition of position 1 of Message 1 with position 1 
of Message 2 are distributed in lines 0 and 1 of the skeleton. Thus, the very first superimposed 
pair is V over Z; the letter Z is inserted in line 1 under the letter V. The next 1-1 pair is the 13th 
superimposition, with F over D; the letter D is inserted in line 1 under the letter F, and so on. The 
skeleton is then as follows: 

0     A B C D E F G H I J K L M N O P Q R S T U V W X Y Z 
1-1   I J . P . D . . . . Q G C E . . . K O . R Z . . . . 
2-2   H V N . . . . . . . . . G . U . . W . . . E D M L X 
3-3   E . . . . M . . X . G . I D J . N . . R . . . . A O 
4-4   . . . . . . X . O C . . . . D K . A F Y Q . . . V N 
1-5   . . . B . T W . L . . . R . E . . . N . Y Q . . U A 
2-6   M O . . I . . . C . . . D . . . . . . . . U V . F R 
3-1   O . G . . R . . . . . . L . P . S . D . . . . . Z . 
4-2   L P . . H . . . . U V . . . . . . . E D M . . F . . 
1-3   . . Q J . . . . . . V W K O X Y . . . . M A . . . . 
2-4   B . . . . . . . J . X P O . . . . . . A . F Y . . D 
3-5   N R . . . Y . . . . . . . . B C G . . . . . . . Q S 
4-6   . . . . M . . . . L O . . . . . . S U V W X . . . . 

Figure 51. 
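
The twelve lines of the skeleton arise because periods of 4 and 6 fall into step again only after 12 letters. A Python sketch of the labeling and distribution follows; the function names are hypothetical, and the two messages are entered as read from the superimposition, so any letter misread there would carry over into the result.

```python
from math import gcd

MSG1 = ("VMYZG EAUNT PKFAY JIZMB UMYKB VFIVV SEOAF SKXKR YWCAC ZORDO ZRDEF "
        "BLKFE SMKSF AFEKV QURCM YZVOX VABTA YYUOA YTDKF ENWNT DBQKU LAJLZ "
        "IOUMA BOAFS KXQPU YMJPW QTDBT OSIYS MIYKU ROGMW CTMZZ VMVAJ")
MSG2 = ("ZGANW IOMOA CODHA CLRLP MOQOJ EMOQU DHXBY UQMGA UVGLQ DBSPU OABIR "
        "PWXYM OGGFT MRHVF GWKNI VAUPF ABRVI LAQEM ZDJXY MEDDY BOSVM PNLGX "
        "XDYDO PXBYU QMNKY FLUYY GVPVR DNCZE KJQOR WJXRV GDKDS XCEEC")


def class_labels(p1: int, p2: int):
    """Pairs of period positions, in the order in which they first occur."""
    cycle = p1 * p2 // gcd(p1, p2)  # least common multiple of the two periods
    return [(i % p1 + 1, i % p2 + 1) for i in range(cycle)]


def skeleton(msg1: str, msg2: str, p1: int, p2: int):
    """Distribute superimposed pairs into one line per pair of positions."""
    a, b = msg1.replace(" ", ""), msg2.replace(" ", "")
    lines = {label: {} for label in class_labels(p1, p2)}
    for i, (x, y) in enumerate(zip(a, b)):
        lines[(i % p1 + 1, i % p2 + 1)][x] = y
    return lines


lines = skeleton(MSG1, MSG2, 4, 6)
for (m, n), row in lines.items():
    print(f"{m}-{n}", " ".join(x + y for x, y in sorted(row.items())))
```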



e. There are more than sufficient data here to permit of the reconstruction of a complete 
equivalent primary component, for example, the following: 

1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 

I  T  K  N  P  Z  H  M  W  B  Q  E  U  L  F  C  S  J  A  X  R  G  D  V  O  Y 

f. The subsequent steps in the actual decipherment of the text of either of the two messages 
are of considerable interest. Thus far the cryptanalyst has only the cipher component of the 
primary sliding components. The plain component may be identical with the cipher component 
and may progress in the same direction, or in the reverse direction; or, the two components 
may be different. If different, the plain component may be the normal sequence, 
direct or reversed. Tests must be made to ascertain which of these various possibilities is true. 

g. (1) It will first be assumed that the primary plain component is the normal direct 
sequence. Applying the procedure outlined in Par. 23 to the message with the shorter key 
(Message No. 1, to give the most data per secondary alphabet), an attempt is made to solve 
the message. It is unnecessary here to go further into detail in this procedure; suffice it to 
indicate that the attempt is unsuccessful and it follows that the plain component is not the 
normal direct sequence. A normal reversed sequence is then assumed for the plain component 
and the proper procedure applied. Again the attempt is found useless. Next, it is assumed 
that the plain component is identical with the cipher component, and the procedure outlined in 
Par. 37 is tried. This also is unsuccessful. Another attempt, assuming the plain component 
runs in the reverse direction, is likewise unsuccessful. There remains one last hypothesis, viz, 
that the two primary components are different mixed sequences. 
(2) Here is Message No. 1 transcribed in periods of four letters. Uniliteral frequency 
distributions for the four secondary alphabets are shown below in Fig. 52, labeled 1a, 2a, 3a, 
and 4a. These distributions are based upon the normal sequence A to Z. But since the reconstructed 
cipher component is at hand these distributions can be rearranged according to the 
sequence of the cipher component, as shown in distributions labeled 1b, 2b, 3b, and 4b in Fig. 52. 
The latter distributions may be combined by shifting distributions 2b, 3b, and 4b to proper 
superimpositions with respect to 1b so as to yield a single monoalphabetic distribution for the entire message. 
In other words, the polyalphabetic message can be converted into monoalphabetic terms, thus very 
considerably simplifying the solution. 



Message No. 1 

VMYZ   VABT 
GEAU   AYYU 
NTPK   OAYT 
FAYJ   DKFE 
IZMB   NWNT 
UMYK   DBQK 
BVFI   ULAJ 
VVSE   LZIO 
OAFS   UMAB 
KXKR   OAFS 
YWCA   KXQP 
CZOR   UYMJ 
DOZR   PWQT 
DEFB   DBTO 
LKFE   SIYS 
SMKS   MIYK 
FAFE   UROG 
KVQU   MWCT 
RCMY   MZZV 
ZVOX   MVAJ 

1a.  A B C D E F G H I J K L M N O P Q R S T U V W X Y Z 

2a.  A B C D E F G H I J K L M N O P Q R S T U V W X Y Z 

3a.  A B C D E F G H I J K L M N O P Q R S T U V W X Y Z 

4a.  A B C D E F G H I J K L M N O P Q R S T U V W X Y Z 

1b.  I T K N P Z H M W B Q E U L F C S J A X R G D V O Y 

2b.  I T K N P Z H M W B Q E U L F C S J A X R G D V O Y 

3b.  I T K N P Z H M W B Q E U L F C S J A X R G D V O Y 

4b.  I T K N P Z H M W B Q E U L F C S J A X R G D V O Y 

Figure 52. 
(3) Note in Fig. 53 how the four distributions are shifted for superimposition and how the 
combined distribution presents the characteristics of a typical monoalphabetic distribution. 

1b.        I T K N P Z H M W B Q E U L F C S J A X R G D V O Y 

2b.        E U L F C S J A X R G D V O Y I T K N P Z H M W B Q 

3b.        K N P Z H M W B Q E U L F C S J A X R G D V O Y I T 

4b.        P Z H M W B Q E U L F C S J A X R G D V O Y I T K N 

1b-4b 
combined   I T K N P Z H M W B Q E U L F C S J A X R G D V O Y 

Figure 53. 



(4) The letters belonging to alphabets 2, 3, and 4 of Fig. 52 may now be transcribed in terms 
of alphabet 1. That is, the two E’s of alphabet 2 become I’s; the L of alphabet 2 becomes a K; 
the C becomes a P, and so on. Likewise, the two K’s of alphabet 3 become I’s, the N becomes 
a T, and so on. The entire message is then a monoalphabet and can readily be solved. It is as 
follows: 



V D V T G   I S W N S   K O F M V   L I R Z Z   U D V O B   U U D V U 
E N E M Y   H A S C A   P T U R E   D H I L L   O N E T W   O O N E O 

F M O M U   U K W I S   Y V L F C   R D S D L   N S D I U   Z L J U M 
U R T R O   O P S H A   V E D U G   I N A N D   C A N H O   L D F O R 

S D I U F   M U M K U   W W R P Z   G Z U D C   V M M V A   F V W O M 
A N H O U   R O R P O   S S I B L   Y L O N G   E R R E Q   U E S T R 

V V D J U   M N V T V   D O W O U   K S L L R   O R U D S   Z O M U U 
E E N F O   R C E M E   N T S T O   P A D D I   T I O N A   L T R O O 

K W W I U   F Z L P V   W V D O Y   R S C V U   M C V O U   B D J M V 
P S S H O   U L D B E   S E N T V   I A G E O   R G E T O   W N F R E 

L V M R N   X M U S L 
D E R I C   K R O A D 
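
The conversion into monoalphabetic terms may be sketched in Python as follows. The shifts 0, 11, 2, and 4 are simply the positions in the cipher component of I, E, K, and P, the letters which head distributions 1b to 4b after matching in Fig. 53; the function names are hypothetical. The sketch also tallies the letters of one secondary alphabet, as in Fig. 52.

```python
from collections import Counter

CIPHER_COMPONENT = "ITKNPZHMWBQEULFCSJAXRGDVOY"
SHIFTS = [CIPHER_COMPONENT.index(c) for c in "IEKP"]  # [0, 11, 2, 4]

MSG1 = ("VMYZGEAUNTPKFAYJIZMBUMYKBVFIVVSEOAFSKXKRYWCACZORDOZRDEFBLKFE"
        "SMKSFAFEKVQURCMYZVOXVABTAYYUOAYTDKFENWNTDBQKULAJLZIOUMABOAFS"
        "KXQPUYMJPWQTDBTOSIYSMIYKUROGMWCTMZZVMVAJ")


def to_monoalphabetic(text, shifts=SHIFTS, component=CIPHER_COMPONENT):
    """Transcribe every alphabet of a periodic cipher in terms of alphabet 1."""
    n = len(component)
    return "".join(
        component[(component.index(c) - shifts[i % len(shifts)]) % n]
        for i, c in enumerate(text)
    )


def distribution(text, alphabet_number, period=4):
    """Uniliteral frequency count for one secondary alphabet (numbered from 1)."""
    return Counter(text[alphabet_number - 1::period])


mono = to_monoalphabetic(MSG1)
print(mono[:30])
print(sorted(distribution(MSG1, 2).items()))
```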











































(5) Having the plain text, the derivation of the plain component (an equivalent) is an 
easy matter. It is merely necessary to base the reconstruction upon any of the secondary 
alphabets, since the plain text-cipher relationship is now known directly, and the primary cipher 
component is at hand. The primary plain component is found to be as follows: 

1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 

H  M  P  C  B  L  .  R  S  W  .  .  O  D  U  G  A  F  Q  K  I  Y  N  E  T  V 

(6) The keywords for both messages can now be found, if desirable, by finding the equivalent 
of Ap in each of the secondary alphabets of the original polyalphabetic messages. The keyword 
for No. 1 is STAR; that for No. 2 is OCEANS. 
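
The recovery of the keywords may be sketched in Python as follows. The two components are those reconstructed above, with periods standing for the three plain-component letters not recovered; the function names are hypothetical. The key letter for each alphabet is taken as the cipher letter standing opposite A of the plain component.

```python
CIPHER_COMPONENT = "ITKNPZHMWBQEULFCSJAXRGDVOY"
PLAIN_COMPONENT = "HMPCBL.RSW..ODUGAFQKIYNETV"  # "." marks letters not recovered
BASE = PLAIN_COMPONENT.index("A")


def recover_keyword(cipher: str, plain: str, period: int) -> str:
    """Find the cipher equivalent of plain A in each secondary alphabet."""
    key = []
    for c, p in list(zip(cipher, plain))[:period]:
        shift = CIPHER_COMPONENT.index(c) - PLAIN_COMPONENT.index(p)
        key.append(CIPHER_COMPONENT[(BASE + shift) % 26])
    return "".join(key)


def decipher(cipher: str, keyword: str) -> str:
    """Decipher with the reconstructed components and a repeating keyword."""
    out = []
    for i, c in enumerate(cipher):
        shift = CIPHER_COMPONENT.index(keyword[i % len(keyword)]) - BASE
        out.append(PLAIN_COMPONENT[(CIPHER_COMPONENT.index(c) - shift) % 26])
    return "".join(out)


print(recover_keyword("VMYZ", "ENEM", 4))      # opening of Message No. 1
print(recover_keyword("ZGANWI", "ENEMYH", 6))  # opening of Message No. 2
print(decipher("VMYZGEAUNTPKFAYJ", "STAR"))
```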

(7) The student may, if he wishes, try to find out whether the primary components 
reconstructed above are the original components or are equivalent components, by examining all the 
possible decimations of the two components for evidences of derivation from keywords. 

h. As already stated in Par. 26f, there are certain statistical and mathematical tests that 
can be employed in the process of “matching” distributions to ascertain proper superimpositions 
for monoalphabeticity. In the case just considered there were sufficient data in the distributions 
to permit the process to be applied successfully by eye, without necessitating statistical tests. 

i. This case is an excellent illustration of the application of the process of converting a 
polyalphabetic cipher into monoalphabetic terms. Because it is a very valuable and important 
cryptanalytic “trick,” the student should study it most carefully in order to gain a good 
understanding of the principle upon which it is based and its significance in cryptanalysis. The 
conversion in the case under discussion was possible because the sequence of letters forming the 
cipher component had been reconstructed and was known, and therefore the uniliteral 
distributions for the respective secondary cipher alphabets could theoretically be shifted to correct 
superimpositions for monoalphabeticity. It also happened that there were sufficient data in 
the distributions to give proper indications for their relative displacements. Therefore, the 
theoretical possibility in this case became an actuality. Without these two necessary conditions 
the superimposition and conversion cannot be accomplished. The student should always be 
on the lookout for situations in which this is possible. 

46. Concluding remarks. — a. The observant student will have noted that a large part of 
this text is devoted to the elucidation and application of a very few basic principles. These 
principles are, however, extremely important and their proper usage in the hands of a skilled 
cryptanalyst makes them practically indispensable tools of his art. The student should therefore 
drill himself in the application of these tools by having someone make up problem after problem 
for him to practice upon, until he acquires facility in their use and feels competent to apply 
them in practice whenever the least opportunity presents itself. This will save him much time 
and effort in the solution of bona fide messages. 

b. Continuing the analytical key introduced in Military Cryptanalysis Part I, the outline 
for the studies covered by Part II follows herewith. 

Analytical Key for Military Cryptanalysis, Part II * 

(Numbers in parentheses refer to paragraph numbers in this text) 

* For explanation of the use of this chart see Par. 80 of Military Cryptanalysis, Part I. 

APPENDIX 1 

The 12 Types of Cipher Squares 
(See Paragraph 7) 

Table I-B ¹ 

Components: 

(1) ABCDEFGHIJKLMNOPQRSTUVWXYZ 

(2) FBPYRCQZIGSEHTDJUMKVALWNOX 

Enciphering equations: θk/2 = θ1/1; θp/1 = θc/2 (θ1/1 is A). 

                      PLAIN TEXT 

     A B C D E F G H I J K L M N O P Q R S T U V W X Y Z 

A    A L W N O X F B P Y R C Q Z I G S E H T D J U M K V 
B    B P Y R C Q Z I G S E H T D J U M K V A L W N O X F 
C    C Q Z I G S E H T D J U M K V A L W N O X F B P Y R 
D    D J U M K V A L W N O X F B P Y R C Q Z I G S E H T 
E    E H T D J U M K V A L W N O X F B P Y R C Q Z I G S 
F    F B P Y R C Q Z I G S E H T D J U M K V A L W N O X 
G    G S E H T D J U M K V A L W N O X F B P Y R C Q Z I 
H    H T D J U M K V A L W N O X F B P Y R C Q Z I G S E 
I    I G S E H T D J U M K V A L W N O X F B P Y R C Q Z 
J    J U M K V A L W N O X F B P Y R C Q Z I G S E H T D 
K    K V A L W N O X F B P Y R C Q Z I G S E H T D J U M 
L    L W N O X F B P Y R C Q Z I G S E H T D J U M K V A 
M    M K V A L W N O X F B P Y R C Q Z I G S E H T D J U 
N    N O X F B P Y R C Q Z I G S E H T D J U M K V A L W 
O    O X F B P Y R C Q Z I G S E H T D J U M K V A L W N 
P    P Y R C Q Z I G S E H T D J U M K V A L W N O X F B 
Q    Q Z I G S E H T D J U M K V A L W N O X F B P Y R C 
R    R C Q Z I G S E H T D J U M K V A L W N O X F B P Y 
S    S E H T D J U M K V A L W N O X F B P Y R C Q Z I G 
T    T D J U M K V A L W N O X F B P Y R C Q Z I G S E H 
U    U M K V A L W N O X F B P Y R C Q Z I G S E H T D J 
V    V A L W N O X F B P Y R C Q Z I G S E H T D J U M K 
W    W N O X F B P Y R C Q Z I G S E H T D J U M K V A L 
X    X F B P Y R C Q Z I G S E H T D J U M K V A L W N O 
Y    Y R C Q Z I G S E H T D J U M K V A L W N O X F B P 
Z    Z I G S E H T D J U M K V A L W N O X F B P Y R C Q 

¹ This table is labeled “Table I-B” because it is the same as Table I-A on page 7, except that the horizontal 
lines of the latter have been shifted so as to begin the successive alphabets with the successive letters of the normal 
sequence. 
Table III

Components:

(1) ABCDEFGHIJKLMNOPQRSTUVWXYZ

(2) FBPYRCQZIGSEHTDJUMKVALWNOX

Enciphering equations: $\Theta_{k/1} = \Theta_{i/2}$; $\Theta_{p/1} = \Theta_{c/2}$ ($\Theta_{i/2}$ is F).

PLAIN TEXT

ABCDEFGHIJKLMNOPQRSTUVWXYZ
Table IV

Components:

(1) ABCDEFGHIJKLMNOPQRSTUVWXYZ

(2) FBPYRCQZIGSEHTDJUMKVALWNOX

Enciphering equations: $\Theta_{k/1} = \Theta_{i/2}$; $\Theta_{p/2} = \Theta_{c/1}$ ($\Theta_{i/2}$ is F).

            PLAIN TEXT

Key   ABCDEFGHIJKLMNOPQRSTUVWXYZ

 A    UBFOLAJMIPSVRXYCGEKNQTWZDH
 B    VCGPMBKNJQTWSYZDHFLORUXAEI
 C    WDHQNCLOKRUXTZAEIGMPSVYBFJ
 D    XEIRODMPLSVYUABFJHNQTWZCGK
 E    YFJSPENQMTWZVBCGKIORUXADHL
 F    ZGKTQFORNUXAWCDHLJPSVYBEIM
 G    AHLURGPSOVYBXDEIMKQTWZCFJN
 H    BIMVSHQTPWZCYEFJNLRUXADGKO
 I    CJNWTIRUQXADZFGKOMSVYBEHLP
 J    DKOXUJSVRYBEAGHLPNTWZCFIMQ
 K    ELPYVKTWSZCFBHIMQOUXADGJNR
 L    FMQZWLUXTADGCIJNRPVYBEHKOS
 M    GNRAXMVYUBEHDJKOSQWZCFILPT
 N    HOSBYNWZVCFIEKLPTRXADGJMQU
 O    IPTCZOXAWDGJFLMQUSYBEHKNRV
 P    JQUDAPYBXEHKGMNRVTZCFILOSW
 Q    KRVEBQZCYFILHNOSWUADGJMPTX
 R    LSWFCRADZGJMIOPTXVBEHKNQUY
 S    MTXGDSBEAHKNJPQUYWCFILORVZ
 T    NUYHETCFBILOKQRVZXDGJMPSWA
 U    OVZIFUDGCJMPLRSWAYEHKNQTXB
 V    PWAJGVEHDKNQMSTXBZFILORUYC
 W    QXBKHWFIELORNTUYCAGJMPSVZD
 X    RYCLIXGJFMPSOUVZDBHKNQTWAE
 Y    SZDMJYHKGNQTPVWAECILORUXBF
 Z    TAENKZILHORUQWXBFDJMPSVYCG
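
A square of this kind can be regenerated mechanically from the two components and the enciphering equations. The following Python sketch does so under the conditions stated for Table IV: the key letter on the normal component stands opposite F of the mixed component, plain text is read on the mixed component, and cipher text on the normal component. The function name `table_iv_row` is an illustrative assumption, not a term used in this text.

```python
import string

NORMAL = string.ascii_uppercase          # component (1), the normal sequence
MIXED = "FBPYRCQZIGSEHTDJUMKVALWNOX"     # component (2), the mixed sequence


def table_iv_row(key: str) -> str:
    """Return the cipher equivalents of plain A..Z for one key letter."""
    # The key letter on (1) is set opposite F, the first letter of (2),
    # so every letter of (2) stands opposite the letter of (1) that lies
    # `shift` places further along.
    shift = NORMAL.index(key)
    return "".join(NORMAL[(MIXED.index(p) + shift) % 26] for p in NORMAL)


if __name__ == "__main__":
    print("  " + NORMAL)
    for key in NORMAL:
        print(key, table_iv_row(key))
```

Each row is the row above it advanced one place in the normal alphabet, which gives a quick visual check on any hand-copied square.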
Table VI

Components:

(1) ABCDEFGHIJKLMNOPQRSTUVWXYZ

(2) FBPYRCQZIGSEHTDJUMKVALWNOX

Enciphering equations: $\Theta_{k/2} = \Theta_{c/1}$; $\Theta_{i/1} = \Theta_{p/2}$ ($\Theta_{i/1}$ is A).

PLAIN TEXT
Table VII

Components:

(1) ABCDEFGHIJKLMNOPQRSTUVWXYZ

(2) FBPYRCQZIGSEHTDJUMKVALWNOX

Enciphering equations: $\Theta_{k/2} = \Theta_{p/1}$; $\Theta_{i/2} = \Theta_{c/1}$ ($\Theta_{i/2}$ is F).
Table VIII

Components:

(1) ABCDEFGHIJKLMNOPQRSTUVWXYZ

(2) FBPYRCQZIGSEHTDJUMKVALWNOX

Enciphering equations: $\Theta_{k/2} = \Theta_{c/1}$; $\Theta_{i/2} = \Theta_{p/1}$ ($\Theta_{i/2}$ is F).
Table IX ²

Components:

(1) ABCDEFGHIJKLMNOPQRSTUVWXYZ

(2) FBPYRCQZIGSEHTDJUMKVALWNOX

Enciphering equations: $\Theta_{k/1} = \Theta_{p/2}$; $\Theta_{i/1} = \Theta_{c/2}$ ($\Theta_{i/1}$ is A).
² An interesting fact about this case is that if the plain component is made identical with the cipher com-
ponent (both being the sequence FBPY . . .), and if the enciphering equations are the same as for Table 1-B,
then the resultant cipher square is identical with Table IX, except that the key letters at the left are in the
order of the reversed mixed component, FXON . . . . In other words, the secondary cipher alphabets produced
by the interaction of two identical mixed components are the same as those given by the interaction of a
mixed component and the normal component.
Table X ³

Components:

(1) ABCDEFGHIJKLMNOPQRSTUVWXYZ

(2) FBPYRCQZIGSEHTDJUMKVALWNOX

Enciphering equations: $\Theta_{k/1} = \Theta_{c/2}$; $\Theta_{i/1} = \Theta_{p/2}$ ($\Theta_{i/1}$ is A).

PLAIN TEXT

ABCDEFGHIJKLMNOPQRSTUVWXYZ

³ Footnote 2 to Table IX, page 104, also applies to this table, except that the key letters at the left will
follow the order of the direct mixed component.
Table XII

Components:

(1) ABCDEFGHIJKLMNOPQRSTUVWXYZ

(2) FBPYRCQZIGSEHTDJUMKVALWNOX

Enciphering equations: $\Theta_{k/1} = \Theta_{c/2}$; $\Theta_{i/2} = \Theta_{p/1}$ ($\Theta_{i/2}$ is F).

            PLAIN TEXT

Key   ABCDEFGHIJKLMNOPQRSTUVWXYZ

 A    FXONWLAVKMUJDTHESGIZQCRYPB
 B    BFXONWLAVKMUJDTHESGIZQCRYP
 C    PBFXONWLAVKMUJDTHESGIZQCRY
 D    YPBFXONWLAVKMUJDTHESGIZQCR
 E    RYPBFXONWLAVKMUJDTHESGIZQC
 F    CRYPBFXONWLAVKMUJDTHESGIZQ
 G    QCRYPBFXONWLAVKMUJDTHESGIZ
 H    ZQCRYPBFXONWLAVKMUJDTHESGI
 I    IZQCRYPBFXONWLAVKMUJDTHESG
 J    GIZQCRYPBFXONWLAVKMUJDTHES
 K    SGIZQCRYPBFXONWLAVKMUJDTHE
 L    ESGIZQCRYPBFXONWLAVKMUJDTH
 M    HESGIZQCRYPBFXONWLAVKMUJDT
 N    THESGIZQCRYPBFXONWLAVKMUJD
 O    DTHESGIZQCRYPBFXONWLAVKMUJ
 P    JDTHESGIZQCRYPBFXONWLAVKMU
 Q    UJDTHESGIZQCRYPBFXONWLAVKM
 R    MUJDTHESGIZQCRYPBFXONWLAVK
 S    KMUJDTHESGIZQCRYPBFXONWLAV
 T    VKMUJDTHESGIZQCRYPBFXONWLA
 U    AVKMUJDTHESGIZQCRYPBFXONWL
 V    LAVKMUJDTHESGIZQCRYPBFXONW
 W    WLAVKMUJDTHESGIZQCRYPBFXON
 X    NWLAVKMUJDTHESGIZQCRYPBFXO
 Y    ONWLAVKMUJDTHESGIZQCRYPBFX
 Z    XONWLAVKMUJDTHESGIZQCRYPBF
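
The rows of Table XII can likewise be generated from the stated conditions: the index letter F of the mixed component is set opposite the plain-text letter on the normal component, and the cipher letter is then read on the mixed component opposite the key letter. The Python sketch below is a minimal illustration of that reading; the name `table_xii_row` is hypothetical.

```python
import string

NORMAL = string.ascii_uppercase          # component (1)
MIXED = "FBPYRCQZIGSEHTDJUMKVALWNOX"     # component (2)


def table_xii_row(key: str) -> str:
    """Return the cipher equivalents of plain A..Z for one key letter."""
    k = NORMAL.index(key)
    # With F of (2) opposite plain letter p on (1), the letter of (2)
    # standing opposite the key letter lies (k - p) places along (2).
    return "".join(MIXED[(k - NORMAL.index(p)) % 26] for p in NORMAL)


if __name__ == "__main__":
    for key in NORMAL:
        print(key, table_xii_row(key))
```

Under these conditions every row is the reversed mixed sequence, begun one letter later for each successive key letter.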
APPENDIX 2 ¹

Elementary Statistical Theory Applicable to the Phenomena of Repetition

in Cryptanalysis

1. Introductory. — a. In Par. 9c it was stated that the phenomena of repetition in crypt- 
analytics may be removed from the realm of intuition and dealt with statistically. The dis- 
cussion of the matter will here be confined to relatively simple phases of the theory of probability, 
a definition of which implies philosophical questions of no practical interest to the student of 
cryptanalysis. For his purposes, the following definition of a priori probability will be sufficient:

The probability that an event will occur is the ratio of the number of “fav- 
orable cases” to the number of total possible cases, all cases being equally 
likely to occur. By a “favorable case” is meant one which will produce the 
event in question. 

b. In what follows, reference will be made to random assortments of letters and especially to
random text. By the latter will be meant merely that the text under consideration has been
assumed to have been enciphered by some more or less complex cryptographic system so that for
all practical purposes the sequence of letters constituting this text is a random assortment; that
is, the sequence is just about what would have been obtained if the letters had been drawn at
random out of a box containing a large number of the 26 letters of the alphabet, all in equal
proportions, so that there are exactly the same numbers of A’s, B’s, C’s, . . . Z’s. It is assumed
that each time in making a drawing from such a box, the latter is thoroughly shaken so that the
letters are thoroughly mixed and then a single letter is selected at random, recorded, and
replaced in the same box. In what follows, the word “box” will refer to the box as described.

c. A uniliteral frequency distribution of a large volume of random text will be “flat,”
i. e., lacking crests and troughs. 

d. For purposes of statistical analysis, the text of a monoalphabetic substitution cipher is 
equivalent to plain text. As a corollary, when a polyalphabetic substitution cipher has been
reduced to the simple terms of a set of monoalphabets, i. e., when the letters constituting the 
cipher text have been allocated into their proper uniliteral distributions, the letters falling into 
the respective distributions are statistically equivalent to plain text. 

2. Data pertaining to single letters, — a. (1) A single letter will be drawn at random from 
the box. What is the probability that it will be an A? According to the foregoing definition of
probability, since the total number of possible cases is 26 and the number of favorable cases is
here only 1, the probability is $1:26 = \frac{1}{26} = .0385$. This is the probability of drawing an A from
the box. The probability that the letter drawn will be a B, a C, a D, . . ., a Z is the same as for A.
In other words, the probability of drawing any specified single letter is $p = .0385$.

(2) The value p=.0385, as found above, may also be termed the probability constant for 
single letters in random text of a 26-letter alphabet. For any language this constant is merely 
the reciprocal of the total number of different characters which may be employed in writing the 
text in question. 

¹ In the preparation of this appendix, the author has had the benefit of the very helpful suggestions of
Capt. H. G. Miller, Signal Corps, Mr. F. B. Rowlett, Dr. S. Kullback, and Dr. A. Sinkov, Assistant Cryptanalysts,
O. C. Sig. O. Certain parts of Dr. Kullback’s important paper “Statistical Methods in Cryptanalysis” form
the basis of the discussion.
(3) Another way of interpreting the notation p=.0385 is to say that in a large volume of
random text, for example in 100,000 letters, any letter that one may choose to specify may be 
expected to occur about 3,850 times; in 10,000 letters it may be expected to occur about 385 
times; in 1,000 letters, about 38.5 times, and so on. In every-day language it would be said 
that “in the long run” or “on the average” in 1,000 letters of random text there will be about 
38.5 occurrences of each of the 26 letters of the alphabet. 
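
The arithmetic of this subparagraph may be sketched in a few lines of Python. The function name `expected_count` and the default alphabet size are illustrative assumptions; the calculation itself is only the product of the number of letters and the probability constant 1/26.

```python
def expected_count(n_letters: int, alphabet_size: int = 26) -> float:
    """Expected occurrences of any one specified letter in random text."""
    p = 1 / alphabet_size          # probability constant for single letters
    return n_letters * p


if __name__ == "__main__":
    for n in (100_000, 10_000, 1_000, 100):
        print(n, round(expected_count(n), 2))
```

The exact products differ slightly from the round figures quoted above (3,846 rather than 3,850 in 100,000 letters) because the text works with p rounded to .0385.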

(4) But unfortunately, in cryptanalysis it is not often the case that one has such a large 
number of letters available for study in any single cipher alphabet. More often the cryptanalyst 
has a relatively small number of letters and these must be distributed over several cipher 
alphabets. Hence it is necessary to be able to deal with smaller numbers of letters. Consider 
a specific piece of random text of only 100 letters. It has been seen that “in the long run” 
each letter may be expected to occur about 3.85 times in this amount of random text; that is, 
the 26 letters will have an average frequency of 3.85. But in reaching this average of 3.85 
occurrences in 100 letters, it is obvious that some letter or letters may not appear at all, some 
may appear once, some twice, and so on. How many will not appear at all; how many will 
appear 1, 2, 3, . . . times? In other words, how will the different categories of letters (different
in respect to frequency of occurrence) be distributed, or what will the distribution be like?
Will it follow any kind of law or pattern? The cryptanalyst also wants to know the answer
to questions such as these: What is the probability that a specified letter will not appear at
all in a given piece of text? That it will appear exactly 1, 2, 3, . . . times? That it will appear
at least 1, 2, 3, . . . times? The same sort of questions may be asked with respect to digraphs,
trigraphs, and so on. 

b. (1) It may be stated at once that questions of this nature are not easily answered, and
a complete discussion falls quite outside the scope of this text. However, it will be sufficient 
for the present purposes if the student is provided with a more or less simple and practical means 
of finding the answers. With this in view certain curves have been prepared from data based 
upon Poisson’s exponential expansion, or the “law of small probabilities” and their use will 
now be explained. Students without a knowledge of the mathematical theory of probability 
and statistics will have to take the curves “on faith.” Those interested in their derivation are
referred to the following texts: 

Fisher, R. A., Statistical Methods for Research Workers, London, 1937.

Fry, T. C., Probability and Its Engineering Uses, New York, 1928. 

(2) By means of these probability curves, it is possible to find, in a relatively easy manner,
the probability for 0, 1, 2, . . . 11 occurrences of an event in n cases, if the mean (expected,
average, probable) number of occurrences in these n cases is known. For example, given a
cryptogram equivalent to 100 letters of random text, what is the probability that any specified single
letter whatever will not appear at all in the cryptogram? Since the probability of the occurrence
of a specified single letter is $\frac{1}{26} = .0385$, and there are 100 letters in the cryptogram, the average
or expected or mean number of occurrences of an A, a B, a C, . . ., is $.0385 \times 100 = 3.85$. Refer
now to that probability curve which is marked “$f_0$”, meaning “frequency zero”, or “zero
occurrences.” On the horizontal or x axis of that curve find the point corresponding to the value
3.85 and follow the vertical coordinate determined by this value up to the point of intersection
with the curve itself; then follow the horizontal coordinate determined by this intersection point
over to the left and read the value on the vertical axis of the curve. It is approximately .021.
This means that the probability that a specified single letter (an A, a B, a C, . . .) will not appear
at all in the cryptogram, if it really were a perfectly random assortment of 100 letters, is .021.
That is, according to the theory of probability, in 1,000 cases of random-text messages of 100
letters each, one may expect to find about 21 messages in which a specified single letter will not
appear at all. Another way of saying the same thing is: If 1,000 sets of 100 letters of random
text are examined, in about 21 out of the 1,000 such sets any letter that one may choose to
name will be absent. This, of course, is merely a theoretical expectancy; it indicates only
what probably will happen in the long run.
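
For readers who prefer to compute rather than read the value from a curve, the Poisson term that the curves represent can be evaluated directly. The following is a minimal Python sketch, assuming the usual form of the Poisson law, $e^{-m} m^x / x!$; the function name `poisson_pmf` is an illustrative choice.

```python
import math


def poisson_pmf(x: int, mean: float) -> float:
    """Probability of exactly x occurrences when the expected number is `mean`."""
    return math.exp(-mean) * mean ** x / math.factorial(x)


if __name__ == "__main__":
    mean = 0.0385 * 100            # 100 letters of random text
    print(round(poisson_pmf(0, mean), 3))
```

With the mean 3.85 this gives about .021 for zero occurrences, in agreement with the reading taken from the curve.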

(3) What is the probability that a specified single letter will appear exactly once in 100 
letters of random text? To answer this question, find on the curve marked $f_1$ the point of
intersection of the vertical coordinate corresponding to the mean or average value 3.85 with
the curve; follow the horizontal coordinate thus determined over to the vertical scale at the 
left; read the value on this scale. It is .082, which means that in 1,000 cases of random-text 
messages of 100 letters each, one may expect to find about 82 messages in which any letter 
one chooses to specify will occur exactly once, no more and no less. 

(4) In the same way, the probability that a specified single letter will appear exactly twice
is found to be .158; exactly 3 times, .202; and so on, as shown in the table below: 



100 letters of random text

Frequency     Probability that a specified single
   (x)        letter will occur exactly x times

    0                    0.021
    1                     .082
    2                     .158
    3                     .202
    4                     .195
    5                     .150
    6                     .096
    7                     .053
    8                     .026
    9                     .011
   10                     .004
   11                     .001



(5) To find the probability that a specified single letter will occur at least 1, 2, 3, . . . times
in a series of letters constituting random text, one reasons as follows: Since the concept “at least
1” implies that the number specified is to be considered only as the minimum, with no limit
indicated as to maximum, occurrences of 2, 3, 4, . . . are also “favorable” cases; the probabilities
for exactly 1, 2, 3, 4, . . . occurrences should therefore be added and this will give the probability
for “at least 1.” Thus, in the case of 100 letters, the sum of the probabilities for exactly 1 to 11
occurrences, as set forth in the table directly above, is .978, and the latter value approximates
the probability for at least 1 occurrence.

(6) A more accurate result will be obtained by the following reasoning. The probability
for zero occurrences is .021. Since it is certain that a specified letter will occur either zero times
or 1, 2, 3, . . . times, to find the probability for at least one time it is merely necessary to
subtract the probability for zero occurrences from unity. That is, $1 - .021 = .979$, which is .001
greater than the result obtained by the other method. The reason it is greater is that the value
.979 includes occurrences beyond 11, which were excluded from the previous calculation. Of
course, the probabilities for those occurrences beyond 11 are very small, but taken all together they
add up to .001, the difference between the results obtained by the two methods. The
probability for at least 2 occurrences is the difference between unity and the sum of the probabilities
for zero and exactly 1 occurrences; that is, $1 - (P_0 + P_1) = 1 - (.021 + .082) = 1 - .103 = .897$. The
respective probabilities for various numbers of occurrences of a specified single letter (from 0 to
11) are given in the following table:



100 letters of random text

Frequency     Probability that a specified     Probability that a specified
   (x)        single letter will occur         single letter will occur
              exactly x times                  at least x times

    0                0.021                            1.000
    1                 .082                             .979
    2                 .158                             .897
    3                 .202                             .739
    4                 .195                             .537
    5                 .150                             .342
    6                 .096                             .192
    7                 .053                             .096
    8                 .026                             .043
    9                 .011                             .017
   10                 .004                             .006
   11                 .001                             .002
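
The subtraction from unity described in subparagraph (6) can be sketched as follows. This is a minimal Python illustration assuming the Poisson law for the "exactly x" terms; the names `poisson_pmf` and `poisson_at_least` are hypothetical.

```python
import math


def poisson_pmf(x: int, mean: float) -> float:
    """Probability of exactly x occurrences for the given mean."""
    return math.exp(-mean) * mean ** x / math.factorial(x)


def poisson_at_least(x: int, mean: float) -> float:
    """Probability of at least x occurrences: unity less the terms below x."""
    return 1.0 - sum(poisson_pmf(k, mean) for k in range(x))


if __name__ == "__main__":
    mean = 3.85                    # 100 letters of random text
    for x in range(12):
        print(x, round(poisson_pmf(x, mean), 3), round(poisson_at_least(x, mean), 3))
```

Because the tabulated figures were read from curves, a computed value may differ from the table by a unit in the third decimal place.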



(7) The foregoing calculations refer to random text composed of 100 letters. For other
numbers of letters, it is merely necessary to find the mean (multiply the probability for drawing
a specified single letter out of the box, which is $\frac{1}{26}$ or .0385, by the number of letters in the
assortment) and refer to the various curves, as before. For example, for a random assortment
of 200 letters, the mean is $200 \times .0385$, or 7.7, and this is the value of the point to be sought along
the horizontal or x axes of the curves; the intersections of the respective vertical lines
corresponding to this mean with the various curves for 0, 1, 2, 3, . . . occurrences give the probabilities for
these occurrences, the reading being taken on the vertical or y axes of the curves.

(8) The discussion thus far has dealt with the probabilities for 0, 1, 2, 3, . . . occurrences 
of specified single letters. It may be of more practical advantage to the student if he could be 
shown how to find the answer to these questions: Given a random assortment of 100 letters 
how many letters may be expected to occur exactly 0, 1, 2, 3, . . . times? How many may be 
expected to occur at least 1, 2, 3, . . . times? The curves may here again be used to answer
these questions, by a very simple calculation: multiply the probability value as obtained above 
for a specified single letter by the number of different elements being considered. For example, 
the probability that a specified single letter will occur exactly twice in a perfectly random assort- 
ment of 100 letters is .158 ; since the number of different letters is 26, the absolute number of single 
letters that may be expected to occur exactly 2 times in this assortment is $.158 \times 26 = 4.108$.
That is, in 100 letters of random text there should be about four letters which occur exactly 2 times.
The following table gives the data for various numbers of occurrences. 
100 letters of random text

Frequency   Probability that    Probability that    Probable number     Probable number
   (x)      a specified single  a specified single  of letters          of letters
            letter will occur   letter will occur   appearing exactly   appearing at least
            exactly x times     at least x times    x times             x times

    0            0.021               1.000               0.546              26.000
    1             .082                .979               2.132              25.454
    2             .158                .897               4.108              23.322
    3             .202                .739               5.252              19.214
    4             .195                .537               5.070              13.962
    5             .150                .342               3.900               8.892
    6             .096                .192               2.496               4.992
    7             .053                .096               1.378               2.496
    8             .026                .043                .676               1.118
    9             .011                .017                .286                .442
   10             .004                .006                .104                .156
   11             .001                .002                .026                .052
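
The last two columns of such a table are obtained by multiplying each probability by the number of different letters. A minimal Python sketch of that step follows, again assuming the Poisson law for the probabilities; the function names are illustrative only.

```python
import math


def poisson_pmf(x: int, mean: float) -> float:
    """Probability that a specified letter occurs exactly x times."""
    return math.exp(-mean) * mean ** x / math.factorial(x)


def letters_expected_exactly(x: int, n_letters: int, alphabet_size: int = 26) -> float:
    """Probable number of different letters appearing exactly x times."""
    mean = n_letters / alphabet_size
    return alphabet_size * poisson_pmf(x, mean)


def letters_expected_at_least(x: int, n_letters: int, alphabet_size: int = 26) -> float:
    """Probable number of different letters appearing at least x times."""
    mean = n_letters / alphabet_size
    below = sum(poisson_pmf(k, mean) for k in range(x))
    return alphabet_size * (1.0 - below)


if __name__ == "__main__":
    for x in range(12):
        print(x,
              round(letters_expected_exactly(x, 100), 3),
              round(letters_expected_at_least(x, 100), 3))
```

This sketch uses the mean 100/26 rather than the rounded 3.85, so its output agrees with the tabulated figures only to about two decimal places.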



(9) Referring again to the curves, and specifically to the tabulated results set forth directly 
above, it will be seen that the probability that there will be exactly two occurrences of a specified
single letter in 100 letters of random text (.158), is less than the probability that there will be 
exactly three occurrences (.202) ; in other words, the chances that a specified single letter will 
occur exactly three times are better, by about 25 percent, than that it will occur only two times. 
Furthermore, there will be about five letters which will occur exactly 3 times, and about five 
which will occur exactly 4 times, whereas there will be only about two letters which will occur 
exactly 1 time. Other facts of a similar import may be deduced from the foregoing table. 

c. The discussion thus far has dealt with random assortments of letters. What about other 
types of texts, for example, normal plain text? What is the probability that E will occur 0, 1, 
2, 3, . . . times in 50 letters of normal English? The relative frequency value or probability 
that a letter selected at random from a large volume of normal English text will be E is .12604. 
(In 100,000 letters E occurred 12,604 times.) For 50 letters this value must be multiplied by 50, 
giving 6.3 as the mean or point to be found along the x axes of the curves. The probabilities for 
0, 1, 2, 3, . . . occurrences are tabulated below: 



50 letters of normal English plain text

Frequency     Probability that an E      Probability that an E
   (x)        will be drawn exactly      will be drawn at least
              x times                    x times

    0                0.002                      1.000
    1                 .011                       .998
    2                 .036                       .987
    3                 .076                       .951
    4                 .120                       .875
    5                 .151                       .755
    6                 .159                       .604
    7                 .143                       .445
    8                 .113                       .302
    9                 .079                       .223
   10                 .050                       .173
   11                 .029                       .123
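
The same calculation applies when the letter is not equally likely with all the others. The Python sketch below takes the relative frequency of E quoted in the text (.12604) and a sample of 50 letters, and evaluates the "exactly x" terms under the Poisson law; the function and variable names are assumptions made for illustration.

```python
import math

P_E = 0.12604                      # relative frequency of E given in the text


def poisson_pmf(x: int, mean: float) -> float:
    """Probability of exactly x occurrences for the given mean."""
    return math.exp(-mean) * mean ** x / math.factorial(x)


if __name__ == "__main__":
    mean = P_E * 50                # about 6.3 for 50 letters
    for x in range(12):
        print(x, round(poisson_pmf(x, mean), 3))
```

The most probable count lies at 6, close to the mean of 6.3, as the "exactly x" column of the table indicates.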
d. (1) It has been seen that the probability of occurrence of a specified single letter in random
text employing a 26-letter alphabet is $\frac{1}{26} = .0385$. If a considerable volume of such text is
written on a large sheet of paper and a pencil is directed at random toward this text, the
probability that the pencil point will hit the letter A, or any other letter which may be specified in advance,
is .0385. Now suppose two pencils are directed simultaneously toward the sheet of paper. The
probability that both pencil points will hit two A’s is $\frac{1}{26} \times \frac{1}{26} = \frac{1}{676} = .00148$, since in this case
one is dealing with the probability of the simultaneous occurrence of two events which are
independent. The probability of hitting two B’s, two C’s, . . ., two Z’s is likewise $\frac{1}{676}$. Hence,
if no particular letter is specified, and merely this question is asked: “What is the probability
that both pencil points will hit the same letter?” the answer must be the sum of the separate
probabilities for simultaneously hitting two A’s, two B’s, and so on, for the whole alphabet,
which is $26 \times \frac{1}{676} = \frac{1}{26} = .0385$. This, then, is the probability that any two letters selected at random
in random text of a 26-letter alphabet will be identical or will coincide. Since this value remains
the same so long as the number of alphabetic elements remains fixed, it may be said that the
probability of monographic coincidence in random text of a 26-element alphabet is .0385. The
foregoing italicized expression ² is important enough to warrant assigning a special symbol to it, viz,
$\kappa_r$ (read “kappa sub-r”). For a 26-element alphabet, then, $\kappa_r = .0385$.
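
The reasoning of this subparagraph, the sum over the alphabet of the squared single-letter probabilities, may be checked with exact fractions. The Python sketch below is illustrative; `kappa_random` is a hypothetical name for the quantity here called kappa sub-r.

```python
from fractions import Fraction


def kappa_random(alphabet_size: int) -> Fraction:
    """Probability that two letters drawn from random text coincide."""
    p = Fraction(1, alphabet_size)             # chance of one specified letter
    return sum(p * p for _ in range(alphabet_size))


if __name__ == "__main__":
    for size in (25, 26, 27):
        k = kappa_random(size)
        print(size, k, round(float(k), 4))
```

The sum always reduces to the reciprocal of the number of elements, which is why the value depends on the size of the alphabet alone.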

(2) Now if one asks: “Given a random assortment of 10 letters, what are the respective 
probabilities of occurrence of 0, 1, 2, . . . single-letter coincidences?” one proceeds as follows. 
As before, it is first necessary to find the mean or expected number of coincidences and then 
refer to the various probability curves. To find the mean, one reasons as follows. Given a 
sequence of 10 letters, one may begin with the 1st letter and compare it with the 2d, 3d, . . . 10th 
letter to see if any two letters coincide; 9 such comparisons may be made, or in other words there 
are, beginning with the 1st letter, 9 opportunities for the occurrence of a coincidence. But 
one may also start with the 2nd letter and compare it with the 3d, 4th . . . 10th letter, thus 
yielding 8 more opportunities for the occurrence of a coincidence, and so on. This process may 
continue until one reaches the 9th letter and compares it with the 10th, yielding but one oppor- 
tunity for the occurrence in question. The total number of comparisons that can be made is 
therefore the sum of the series of numbers 9, 8, 7, . . . 1, which is 45 comparisons.³ Since in
the 10 letters there are 45 opportunities for coincidence of single letters, and since the probability 



² The expression itself may be termed a parameter, which in mathematics is often used to designate a constant
that characterizes by each of its particular values some particular member of a system of values, functions, etc.
The word is applicable in the case under discussion because the value obtained for $\kappa_r$ is .0385 for a 26-element
alphabet; for a 25-element alphabet, $\kappa_r = .0400$; for a 27-element alphabet, .0370, etc.



³ The number of comparisons may readily be found by the formula $\frac{n(n-1)}{2}$, where n is the total number
of letters involved. This formula is merely a special case under the general formula for ascertaining the number
of combinations that may be made of n different things taken r at a time, which is ${}_nC_r = \frac{n!}{r!(n-r)!}$. In the
present case, since only two letters are compared at a time, r is always 2, and hence the expression $\frac{n!}{r!(n-r)!}$,
which is the same as $\frac{n(n-1)(n-2)!}{2(n-2)!}$, becomes by cancellation of the term $(n-2)!$ reduced to $\frac{n(n-1)}{2}$.
for monographic coincidence in random text is .0385, the expected number of coincidences is
$.0385 \times 45 = 1.7325$. With $m = 1.7$ one consults the various probability curves and an
approximate distribution for exactly and for at least 0, 1, 2, . . . coincidences may readily be ascertained.⁴
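
The count of comparisons and the expected number of coincidences can be sketched as below. This is a minimal Python illustration; the function names are assumptions, and the brute-force pairing is included only to confirm the formula n(n-1)/2.

```python
from itertools import combinations


def comparisons(n_letters: int) -> int:
    """Number of distinct pairs of positions among n letters."""
    return n_letters * (n_letters - 1) // 2


def expected_coincidences(n_letters: int, kappa: float) -> float:
    """Mean number of single-letter coincidences for a given kappa."""
    return kappa * comparisons(n_letters)


if __name__ == "__main__":
    # Pair off the ten positions directly and compare with the formula.
    print(len(list(combinations(range(10), 2))), comparisons(10))
    print(expected_coincidences(10, 0.0385))
```

For 10 letters the sketch gives 45 comparisons and a mean of about 1.73 coincidences in random text.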
e. (1) Now consider the matter of monographic coincidence in English plain text.⁵ Following
the same reasoning outlined in subpar. d (1), the probability of coincidence of two A’s in plain
text is the square of the probability of occurrence of the single letter A in such text. The 
probability of coincidence of two B's is the square of the probability of occurrence of the single 
letter B, and so on. The sum of these squares for all the letters of the alphabet, as shown in 
the following table, is found to be .0667. 



Letter     Frequency ¹ in     Probability of separate     Square of probability
           1,000 letters      occurrence of the letter    of separate occurrence

  A            73.66                 .0737                      0.0054
  B             9.74                 .0097                       .0001
  C            30.68                 .0307                       .0009
  D            42.44                 .0424                       .0018
  E           129.96                 .1300                       .0169
  F            28.32                 .0283                       .0008
  G            16.38                 .0164                       .0003
  H            33.88                 .0339                       .0012
  I            73.52                 .0735                       .0054
  J             1.64                 .0016                       .0000
  K             2.96                 .0030                       .0000
  L            36.42                 .0364                       .0013
  M            24.74                 .0247                       .0006
  N            79.50                 .0795                       .0063
  O            75.28                 .0753                       .0057
  P            26.70                 .0267                       .0007
  Q             3.50                 .0035                       .0000
  R            75.76                 .0758                       .0057
  S            61.16                 .0612                       .0037
  T            91.90                 .0919                       .0084
  U            26.00                 .0260                       .0007
  V            15.32                 .0153                       .0002
  W            15.60                 .0156                       .0002
  X             4.62                 .0046                       .0000
  Y            19.34                 .0193                       .0004
  Z              .98                 .0010                       .0000

Total       1,000.00                1.0000                       .0667


1 The data given are taken from Table 3, Appendix 1, Military Cryptanalysis, Part I. 



This then is the probability that any two letters selected at random in a large volume of
normal English telegraphic plain text will coincide. Since this value remains the same so long
as the character of the language does not change radically, it may be said that the probability
of monographic coincidence in English telegraphic plain text is .0667, or $\kappa_p = .0667$.
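
The sum of squares described above can be sketched in Python as follows. The frequencies are those of the table in this subparagraph; as an assumption, the entries for A, I and O are taken as 73.66, 73.52 and 75.28, values chosen so that the column totals 1,000.00. The function name `kappa` is illustrative.

```python
# Frequencies per 1,000 letters of English telegraphic plain text.
# Assumption: A, I and O are taken as 73.66, 73.52 and 75.28 so that
# the column totals 1,000.00.
FREQ_PER_1000 = {
    "A": 73.66, "B": 9.74, "C": 30.68, "D": 42.44, "E": 129.96,
    "F": 28.32, "G": 16.38, "H": 33.88, "I": 73.52, "J": 1.64,
    "K": 2.96, "L": 36.42, "M": 24.74, "N": 79.50, "O": 75.28,
    "P": 26.70, "Q": 3.50, "R": 75.76, "S": 61.16, "T": 91.90,
    "U": 26.00, "V": 15.32, "W": 15.60, "X": 4.62, "Y": 19.34,
    "Z": 0.98,
}


def kappa(freqs: dict) -> float:
    """Sum of the squared relative frequencies of all letters."""
    total = sum(freqs.values())
    return sum((f / total) ** 2 for f in freqs.values())


if __name__ == "__main__":
    print(round(kappa(FREQ_PER_1000), 4))
```

Computed without intermediate rounding the sum comes to about .0669; the figure .0667 results when each square is first rounded to four decimal places, as in the table.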

⁴ The approximation given by the Poisson distribution in the case of single letters is not as good as that
in the case of digraphs, trigraphs, etc., discussed in paragraphs 3, 4, below.

⁵ The theory of monographic coincidence in plain text was originally developed and applied by the author
in a technical paper written in 1925 dealing with his solution of messages enciphered by a cryptograph known 
as the “Hebern Electric Super-Code.” The paper was printed in 1934. 
(2) Given 10 letters of English plain text, what is the probability that there will be 0, 1,
2, . . . single-letter coincidences? Following the line of reasoning in subparagraph d (2), the
expected number of coincidences is $.0667 \times 45 = 3.00$, or $m = 3$. The distribution for exactly and
for at least 0, 1, 2, . . . coincidences may readily be found by reference to the various probability
curves. (See footnote 4.)

f. The fact that $\kappa_p$ (for English) is almost twice as great as $\kappa_r$ is of considerable importance
in cryptanalysis. It will be dealt with in detail in a subsequent text. At this point it will merely
be said that $\kappa_p$ and $\kappa_r$ for other languages and alphabets have been calculated and show
considerable variation, as will be noted in the table shown in paragraph 3d.

3. Data pertaining to digraphs. — a. (1) The foregoing discussion has been restricted to 
questions concerning single letters, but by slight modification it can be applied to questions 
concerning digraphs, trigraphs, and longer polygraphs.

(2) In the preceding cases it was necessary, before referring to the various probability
curves, to find the mean or expected number of occurrences of the event in question in the
total number of cases or trials being considered. Given a piece of random text totalling 100
letters, for example, what is the mean (average, probable, expected) number of occurrences of
digraphs in this text? Since there are 676 different digraphs, the probability of occurrence
of any specified digraph is $\frac{1}{676} = .00148$; since in 100 letters there are 99 digraphs (if the letters
are taken consecutively in pairs) the mean or average number of occurrences in this case is
$.00148 \times 99 = .147$. Having the mean number of occurrences of the event under consideration,
one may now find the answers to these questions: What is the probability that any specified
digraph, say XY, will not occur? What is the probability that it will occur exactly 1, 2,
3, . . . times? At least 1, 2, 3, . . . times?

(3) Again the probability curves may be used as before, for the type of distribution is the
same. The following values are obtainable by reference to the various curves, using the mean 
value .00148X99=.147. 

100 letters of random text

Frequency   Probability that    Probability that    Probable number     Probable number
   (x)      a specified digraph a specified digraph of digraphs         of digraphs
            will occur exactly  will occur at least appearing exactly   appearing at least
            x times             x times             x times             x times

    0            0.86                1.00               581.36              676.00
    1             .13                 .14                87.88               94.64
    2             .01                 .01                 6.76                6.76
    3             .00                 .00                 0.00                0.00



(4) Thus it is seen that in 100 letters of random text the probability that a specified digraph 
will occur exactly once, for example, is .13; at least once, .14; at least twice, .01. The 
probability that a specified digraph will occur at least 3 times is negligible. (By calculation, 
it is found to be .0005.) 
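
The figures in the table above can be checked by direct calculation. The following is a minimal sketch, assuming that the probability curves referred to are those of the Poisson distribution (an assumption, though one that reproduces the tabulated values). The function names are illustrative only, and the mean is taken as 99/676 rather than the rounded .147.

```python
import math

def poisson_pmf(k, m):
    """Probability of exactly k occurrences when the mean is m (Poisson assumption)."""
    return math.exp(-m) * m**k / math.factorial(k)

def poisson_tail(k, m):
    """Probability of at least k occurrences when the mean is m."""
    return 1.0 - sum(poisson_pmf(i, m) for i in range(k))

# Mean number of occurrences of one specified digraph:
# 99 digraph positions in 100 letters, 676 possible digraphs.
m = 99 / 676

for x in range(4):
    exactly = poisson_pmf(x, m)
    at_least = poisson_tail(x, m)
    # The last two columns scale the probabilities by the 676 possible digraphs.
    print(f"{x}  {exactly:.4f}  {at_least:.4f}  {676 * exactly:7.2f}  {676 * at_least:7.2f}")
```

Because the table multiplies probabilities already rounded to two decimals by 676, its last two columns differ slightly from the unrounded products printed by this sketch.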

b. (1) The probability of digraphic coincidence in random text based upon a 26-element 
alphabet is of course quite simply obtained: since there are $26^2$ different digraphs, the 
probability of selecting any specified digraph in random text is $\frac{1}{26^2}$. The probability 
of selecting two identical digraphs in such text, when the digraphs are specified, is 
$\frac{1}{26^2} \times \frac{1}{26^2} = \frac{1}{26^4}$. Since there are $26^2$ different digraphs, 
the probability of digraphic coincidence in random text, $\kappa_r^2$, is 
$26^2 \times \frac{1}{26^4} = \frac{1}{26^2} = .00148$. 
(2) Given a random assortment of 100 letters, what is the probability of occurrence of 
0, 1, 2, . . . digraphic coincidences? Following the line of reasoning in paragraph 2d (2), in 
100 letters the total number of comparisons that may be made to see if two digraphs coincide 
is 4,851. This number is obtained as follows: Consider the 1st and 2d letters in the series of 
100 letters; they may be combined to form a digraph to be compared with the digraphs formed 
by combining the 2d and 3d, the 3d and 4th, the 4th and 5th letters, and so on, giving a total of 
98 comparisons. Consider the digraph formed by combining the 2d and 3d letters; it may be 
compared with the digraphs formed by combining the 3d and 4th, 4th and 5th letters, and so on, 
giving a total of 97 comparisons. This process may be continued down to the digraph formed 
by combining the 98th and 99th letters, which yields only one comparison, since it may be 
compared only with the digraph resulting from combining the 99th and 100th letters. The 
total number of comparisons is the sum of the sequence of numbers 98, 97, 96, 95, . . . 1, which 
is 4,851.* 
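
The counting argument just given can be verified in two ways: by summing the sequence 98, 97, . . . 1 as described, and by counting the pairs that can be formed from the n - t + 1 polygraphs of length t in n letters. The sketch below is illustrative only; the function names are not taken from the text.

```python
def comparisons_by_sum(n, t):
    """Sum the sequence (n - t), (n - t - 1), ..., 1 as in the step-by-step argument."""
    return sum(range(1, n - t + 1))

def comparisons(n, t):
    """Number of distinct pairs among the (n - t + 1) polygraphs of length t."""
    polygraphs = n - t + 1
    return polygraphs * (polygraphs - 1) // 2

for t in (2, 3, 4):
    print(t, comparisons_by_sum(100, t), comparisons(100, t))
```

For 100 letters both methods give 4,851 comparisons for digraphs, 4,753 for trigraphs, and 4,656 for tetragraphs.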

(3) Since in the 100 letters there are 4,851 opportunities for the occurrence of a digraphic 
coincidence, and since $\kappa_r^2$ = .00148, the expected number of coincidences is 
.00148 × 4,851 = 7.17948 = 7.2. The various probability curves may now be referred to and the 
following results are obtained: 



            Distribution for 100 letters of random text

 Frequency (x)    Probability for exactly x    Probability for at least x
                  digraphic coincidences       digraphic coincidences

       0                  0.001                        1.000
       1                   .005                         .999
       2                   .019                         .994
       3                   .046                         .975
       4                   .083                         .929
       5                   .120                         .846
       6                   .144                         .726
       7                   .148                         .582
       8                   .134                         .434
       9                   .107                         .300
      10                   .077                         .193
      11                   .050                         .116
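
The two probability columns of this distribution are linked by simple relations: each "exactly" value follows from the one before it, and each "at least" value is the preceding "at least" value less the preceding "exactly" value. The sketch below builds the table in that way, assuming a Poisson distribution with mean 7.2 (an assumption about the curves, not a statement from the text); the function name is illustrative.

```python
import math

def poisson_table(m, max_x):
    """Rows of (x, P(exactly x), P(at least x)) for a Poisson mean m."""
    rows = []
    exactly = math.exp(-m)  # P(exactly 0)
    at_least = 1.0          # P(at least 0)
    for x in range(max_x + 1):
        rows.append((x, exactly, at_least))
        at_least -= exactly     # P(at least x+1) = P(at least x) - P(exactly x)
        exactly *= m / (x + 1)  # P(exactly x+1) = P(exactly x) * m / (x+1)
    return rows

table = poisson_table(7.2, 11)
for x, exactly, at_least in table:
    print(f"{x:2d}   {exactly:.3f}   {at_least:.3f}")
```

Values computed in this way may differ from the printed ones by a few units in the third decimal place, as is to be expected if the printed values were read from curves.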



c. In this table it will be noted that it is almost certain that in 100 letters of random text 
there will be at least one digraphic coincidence, despite the fact that there are 676 possible 
digraphs and only 99 digraphs appear in 100 letters. When one thinks of a total of 676 
different digraphs from which the 99 digraphs may be selected it may appear rather incredible 
that the chances are better than even (.582) that one will find at least 7 digraphic coincidences in 
100 letters of random text, yet that is what the statistical analysis of the problem shows to be 
the case. These are, of course, purely accidental repetitions. It is important that the student 
should fully realize that more coincidences or accidental repetitions than he feels intuitively 
should occur in random text will actually occur in the cryptograms he will study. He must 
therefore be on guard against putting too much reliance upon the surface appearances of the 
phenomena of repetition; he must calculate what may be expected from pure chance, to make 
sure that the number and length of the repetitions he does see in a cryptogram are really better 
than what may be expected in random text. In studying cryptograms composed of figures this 
is very important, for as the number of different symbols decreases the probability for purely 
chance coincidences increases. 

* The formula for finding the number of comparisons that can be made is as follows, where n = the total 
number of letters in the sequence and t is the length of the polygraph: 
No. of comparisons = $\frac{(n-t+1)(n-t)}{2}$. 
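
A student who finds these figures hard to credit may test them empirically. The following is a minimal simulation sketch, not part of the original analysis: it generates random 100-letter texts, counts the digraphic coincidences in each by pairing every digraph with every later digraph, and records the results. The function names, the number of trials, and the seed are all arbitrary choices made for illustration.

```python
import random
from collections import Counter

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def digraphic_coincidences(text):
    """Count pairs of identical digraphs among all overlapping digraphs of text."""
    counts = Counter(text[i:i + 2] for i in range(len(text) - 1))
    # A digraph seen c times contributes c * (c - 1) / 2 coinciding pairs.
    return sum(c * (c - 1) // 2 for c in counts.values())

def simulate(trials=2000, n=100, seed=1):
    """Return the coincidence count for each of `trials` random n-letter texts."""
    rng = random.Random(seed)
    results = []
    for _ in range(trials):
        text = "".join(rng.choice(ALPHABET) for _ in range(n))
        results.append(digraphic_coincidences(text))
    return results

results = simulate()
mean = sum(results) / len(results)
share_at_least_7 = sum(r >= 7 for r in results) / len(results)
print(f"mean coincidences: {mean:.2f}")
print(f"share of texts with at least 7 coincidences: {share_at_least_7:.3f}")
```

With a few thousand trials the mean should fall near 7.2 and somewhat more than half of the texts should show 7 or more coincidences, in agreement with the table.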

d. (1) For convenience the following values of the reciprocals of various numbers from 20 to 
36, and of the reciprocals of the squares, cubes, and 4th powers of these numbers are listed: 



  x       1/x        1/x²         1/x³          1/x⁴

 20     0.0500     0.002500     0.000125     0.00000625
 21      .0476      .002266      .000108      .00000514
 22      .0455      .002070      .000094      .00000420
 23      .0435      .001892      .000082      .00000358
 24      .0417      .001739      .000073      .00000302
 25      .0400      .001600      .000064      .00000256
 26      .0385      .001482      .000057      .00000220
 27      .0370      .001369      .000051      .00000187
 28      .0357      .001274      .000046      .00000162
 29      .0345      .001190      .000041      .00000142
 30      .0333      .001109      .000037      .00000123
 31      .0323      .001043      .000034      .00000109
 32      .0313      .000980      .000031      .00000096
 33      .0303      .000918      .000028      .00000084
 34      .0294      .000864      .000025      .00000075
 35      .0286      .000818      .000023      .00000067
 36      .0278      .000773      .000021      .00000060
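
Values of this kind are easily computed afresh when greater precision is wanted. The sketch below (the function name is illustrative) prints the reciprocals and their powers directly. It appears that the tabulated squares were formed from reciprocals already rounded to four places, so that they differ slightly from the directly computed values; for 26, for instance, the table gives .001482, whereas 1/676 = .001479.

```python
def reciprocal_powers(x):
    """Return (1/x, 1/x^2, 1/x^3, 1/x^4) computed without intermediate rounding."""
    return (1 / x, 1 / x**2, 1 / x**3, 1 / x**4)

for x in range(20, 37):
    r1, r2, r3, r4 = reciprocal_powers(x)
    print(f"{x:2d}  {r1:.4f}  {r2:.6f}  {r3:.6f}  {r4:.8f}")
```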



(2) The following table gives the probabilities for monographic and digraphic coincidence 
for plain-text in several languages. 



 Language     $\kappa_p$     $\kappa_p^2$

 English        0.0667         0.0069
 French          .0778          .0093
 German          .0762          .0112
 Italian         .0738          .0081
 Spanish         .0775          .0093



4. Data pertaining to trigraphs, etc. — a. Enough has been shown to make clear to the student 
how to calculate probability data concerning trigraphs, tetragraphs, and longer polygraphs. 

b. (1) For example, in 100 letters of random text the value of m (the mean) for trigraphs 
is .00005689 × 100 = .005689. With so small a value, the probability curves are hardly usable, 
but at any rate they show that the probability of occurrence of a specified trigraph in so small 
a volume of text is so small as to be practically negligible. The probability of a specified trigraph 
occurring twice in that text is an even smaller quantity. 

(2) The calculation for finding the probability of at least one trigraphic coincidence in 100 
letters of random text is as follows: 

$m = \frac{98 \times 97}{2} \times .00005689 = 4,753 \times .00005689 = .2704 = .27$ 

Referring to curve $f_0$, with m = .27 the probability of finding no trigraphic coincidence is .76. 
The probability of finding at least one trigraphic coincidence is therefore 1 - .76 = .24. 

c. The calculation for a tetragraphic coincidence is as follows: 

$m = \frac{97 \times 96}{2} \times \frac{1}{26^4} = 4,656 \times .0000021883 = .0101 = .01$ 

Referring to curve $f_0$, with m = .01 the probability of finding no tetragraphic coincidence is 
so high as to amount almost to certainty. Consequently, the probability of finding at least 
one tetragraphic coincidence is practically nil. (It is calculated to be .0094 = approximately .01. 
This means that in a hundred cases of 100-letter random-text cryptograms, one might expect 
to find but one cryptogram in which a 4-letter repetition is brought about purely by chance; it 
is, in common parlance, a “hundred to one shot.”) Consequently, if a tetragraphic repetition 
is found in a cryptogram of 100 letters, the probability that it is an accidental repetition is 
extremely small. If not accidental, then it must be causal, and the cause should be ascertained. 
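
The calculations for digraphs, trigraphs, and tetragraphs all follow one pattern, which may be sketched as follows. The sketch assumes, as in the preceding paragraphs, that the probability of no coincidence is read from the Poisson value e^(-m); the function names and the default alphabet size are illustrative assumptions.

```python
import math

def expected_coincidences(n, t, alphabet=26):
    """Mean number of accidental coincidences of t-letter polygraphs in n random letters."""
    polygraphs = n - t + 1
    pairs = polygraphs * (polygraphs - 1) // 2  # number of comparisons
    return pairs / alphabet**t

def prob_at_least_one(n, t, alphabet=26):
    """Probability of at least one coincidence, taking P(none) = exp(-m)."""
    return 1.0 - math.exp(-expected_coincidences(n, t, alphabet))

for t, name in ((2, "digraphic"), (3, "trigraphic"), (4, "tetragraphic")):
    m = expected_coincidences(100, t)
    print(f"{name:13s} m = {m:.4f}   P(at least one) = {prob_at_least_one(100, t):.4f}")
```

Passing a smaller value for `alphabet` (10, for a cryptogram composed of figures) shows how quickly the chance of accidental repetition rises as the number of different symbols decreases.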

5. An example. — a. The message of Par. 9a of the text proper will be employed. First, let 
the repetitions be sought and underlined; then the repetitions are listed for convenience. 



A.   U S Y E S E C P M P L C C L N X B W C S O X U V D
B.   S C R H T H X I P L I B C I J U S Y E E G U R D P
C.   A Y B C X O F P J W J E M G P X V E U E L E J Y Q
D.   M U S C X J Y M S G L L E T A L E D E C G B M F I


 Group    Number of
          occurrences

 BC            2
 CX            2
 EC            2
 LE            3
 JY            2
 PL            2
 SC            2
 SY            2
 US            3
 YE            2
 SYE           2
 USY           2
 USYE          2
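
The search for repetitions can also be carried out mechanically. The following sketch is illustrative only: it takes the four lines of the cryptogram as one continuous sequence of 100 letters (an assumption about how the message is to be read) and lists every polygraph of 2 to 5 letters that occurs more than once. The names used are not from the text.

```python
from collections import Counter

# The cryptogram, lines A to D, written as one continuous sequence.
CRYPTOGRAM = (
    "USYESECPMPLCCLNXBWCSOXUVD"
    "SCRHTHXIPLIBCIJUSYEEGURDP"
    "AYBCXOFPJWJEMGPXVEUELEJYQ"
    "MUSCXJYMSGLLETALEDECGBMFI"
)

def repeated_polygraphs(text, min_len=2, max_len=5):
    """Map each polygraph of min_len..max_len letters occurring more than once to its count."""
    found = {}
    for t in range(min_len, max_len + 1):
        counts = Counter(text[i:i + t] for i in range(len(text) - t + 1))
        found.update({group: c for group, c in counts.items() if c > 1})
    return found

repetitions = repeated_polygraphs(CRYPTOGRAM)
for group in sorted(repetitions, key=lambda g: (len(g), g)):
    print(f"{group:5s} {repetitions[group]}")
```

Overlapping occurrences are counted separately, which is why the digraphs and trigraphs contained in the repeated tetragraph USYE appear in the list in their own right.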



b. Referring to the table in Par. 3a (3) above, it will be seen that in 100 letters of random 
text one might expect to find about 7 digraphs appearing at least twice and no digraph appearing 
3 times. The list of repetitions shows 8 digraphs occurring twice and 2 occurring 3 times. 

c. Again, the list of repetitions shows 10 digraphs each occurring at least twice; the table in 
Par. 3b (3) above shows that in 100 letters of random text the probability of finding at least 
that many digraphic coincidences is only .193. That is, the chances of this being an accident are 
but 193 in a thousand; or another way of expressing the same thing is to say that the odds against 
this phenomenon being an accident are as 807 is to 193 or roughly 4 to 1. 

d. The probability of finding at least one trigraphic coincidence in 100 letters of random 
text is small, as noted in Par. 4b (about .24); the probability of finding at least one tetragraphic 
coincidence is far smaller (Par. 4c). Yet this cipher message of but 100 letters contains a 
tetragraphic repetition. 

e. A consideration of the foregoing leads to the conclusion that the number and length of the 
repetitions manifested by the cryptogram are not accidental, such as might be expected to occur 
in random text of the same length; hence they must be causal in their origin. The cause in this 
case is not difficult to find: repeated isolated letters and repeated sequences of letters (digraphs, 
trigraphs) in the plain text were actually enciphered by identical alphabets, resulting in 
repeated letters and sequences in the cipher text. 